WHAT SHALL BE TAUGHT IN EDUCATIONAL
PSYCHOLOGY? 


GOODWIN B. WATSON 


Teachers College, Columbia University 


The present status of the teaching of educational psychology, 
in the light of the contributions which it has made to the selection of 
subject-matter and the development of method in other fields, prompts 
the time-honored suggestion: “Physician, heal thyself!”

Educational psychology has appeared to be crystallizing. That 
would be less serious, if it were certain that the crystalline organiza- 
tion would embody the materials of most worth for the training of 
educators. Judd, yielding to no one in his interest in psychology, 
has recently stated his firm conviction that nobody really knows what 
psychology teachers ought to study. Dean Russell has mentioned 
more than once his belief that educational psychology has built up, 
through careful research, a body of subject-matter from which the 
best selection for teaching purposes has not yet been made. The 
advocates of professionalized subject-matter in teacher preparation 
suggest that if each teacher of special subjects in teachers colleges 
would incorporate in his own course the relevant psychology, then 
perhaps no general course in educational psychology would be needed. 
Some expect to see educational psychology as a systematic study 
dropped from normal schools as history of education has been. 

In part, the criticisms arise from faulty teaching method. The 
principle that material should be learned in the form in which it is to 
be used has been more preached than practiced by educational psy- 
chologists. It may be that educational psychology taught in system- 
atic fashion is likely to contribute to teaching skill no more than does 
formal grammar to skill in speaking and writing. If anatomy is most 
profitably learned in connection with surgery and dissection, if the 
proper approach to legal principles is through cases, if one never 

578 The Journal of Educational Psychology 


attains real insight into statistics except as procedures are evaluated in 
the light of specific research problems, then it may be that educational 
psychology, in order to influence the thought of educators most effectively,
should be approached through the tangled situations to which
its laws are supposed to make a contribution. 

Present criticisms bear even more strongly upon the selection which 
has taken place among the many elements which might be included in 
educational psychology. 

The items to be studied in educational psychology may be selected 
in at least four ways. The most common procedure has been to utilize 
the judgment of a man skilled in psychological research. This might 
be expected to lead to the preeminence of those elements in educational 
psychology upon which most research has been done. In part, these 
represent interests carried over from general psychology. Moreover,
most research has been done, not upon the most important problems,
but upon those which are most easily controlled and subjected to
scientific investigation. Hence we find mirror-tracing and nonsense 
syllables given more place in some texts than is awarded to handwriting
or the mastery of a foreign language. It must not be concluded that 
the research activities have gone astray. Perhaps they have at some 
points, but that is not the suggestion of this study. The research 
worker should not be limited to the study of problems which educa- 
tors believe to be immediately helpful. His search for general truth, 
valuable in remote ways, may well be unhampered. The argument 
here is that the results of such endeavors, to date, do not determine the 
best selection of materials and principles of organization for a course in 
educational psychology. 

A second procedure would be the evaluation of subject-matter in 
the light of its contribution to professional success. This awaits the 
development of adequate criteria of professional success. Once these 
are obtained, anyone can discover what psychological training is 
used by good teachers, supervisors, and administrators, but not with 
like facility by poor teachers, poor supervisors, and poor adminis- 
trators. Somewhat less satisfactory would be a third basis of selection 
which would utilize job-analyses. If it were known just what are the 
central and marginal responsibilities of a high school Latin teacher or 
elementary school supervisor, then shrewd psychologists might make 
excellent selections from their wares to meet these needs. So likewise, 
upon the basis of job-analyses of each of the manifold other educational 
functions, selection might proceed. 

A fourth possibility has been utilized in this study because it seems
immediately practicable. If the available materials of educational
psychology be translated into the terminology of the educational
problems to which they contribute, what would be the importance of
those problems in the judgment of people who ought to know? The
opinions of three general classes of people have been studied. A
group of experts in the training of educators has been found in the
faculty of the school of education at Teachers College. A group of
experienced teachers, supervisors and administrators has been found
in certain graduate courses in educational psychology. A group of
intelligent undergraduates who have never taught but are preparing
for teaching, contributed the third point of view.

The questionnaire which follows was formulated by the writer 
with the cooperation of Mr. Ralph B. Spence, Instructor in Educa- 
tional Psychology at Teachers College. Each topic which had been 
included in texts, courses, and examinations in educational psychology 
in common use, was challenged with the question, “What difference
does this make in the business of education?” The resulting specific 
questions were grouped under 15 general topics, with some additional 
illustrations. When the questionnaire was given to the 400 subjects, 
each person was asked to suggest other topics which might be added. 
All suggestions which did not seem to be clearly involved in some 
existing question were added to the list, and are embodied in the ques- 
tionnaire as given here, but marked with an asterisk. A more sys- 
tematic check against the contents of certain psychological texts used 
in schools of education revealed that while the problems treated by 
those texts appear in somewhat unfamiliar garb in the questionnaire, 
yet there were few, if any, problems mentioned in the text which were 
not directly or indirectly concerned in some of these concrete problems.
The list is probably complete enough for a good working basis. 


AN EVALUATION OF PROBLEMS IN THE FIELD OF EDUCATIONAL
PSYCHOLOGY


Directions.—Educational Psychology should provide help on the problems upon
which educators are most anxious for help. This survey is an endeavor to find out 
the importance which you would place, as a result of your experience, upon each 
of the problems suggested below. 

Rate each major division (underlined) on a scale of 1-10. Thus: 

10 represents extreme importance, utmost value, highest interest
9 represents very great importance, value, and interest
8 represents much importance, value, and interest
7 represents considerable importance, value, and interest
6 represents more than average importance, value, and interest
5 represents just average importance, value, and interest
4 represents less than average importance, value, and interest
3 represents little importance, value, and interest
2 represents very little importance, value, and interest
1 represents practically no importance, value, and interest

Do not rate the sub-questions. They are merely unarranged examples and 
explanations of the type of question included in the major division. 

Whenever you think of a sub-question which you feel would be important, 
valuable, and interesting, please write it in the blank space at the close of the 
major division in which it should be included. 

If you think of some whole set of problems not included here, but which would 
be of real value to educators, from a psychological point of view, please list it as 
another major division, at the end of the survey list. 





I. Problems of Original Nature, Heredity, and Environment. Rating......... 

1. What are the dominant human urges, wishes, drives, “psychological
pressures?”

2. How far are social ills (e.g., war, prostitution, race discrimination,
poverty, disease, crime, etc.) rooted in original nature, and how far
are they produced by undesirable educational factors in the
environment?

*3. How far is the educator limited by the nature of the child at birth?
How far is artistic achievement, or moral character dependent upon a
pre-disposition?

*4. Does education gradually improve the intelligence of the race? 


II. Problems of Personality Adjustment for Teachers and Pupils. Rating...... 
1. What are the principal personality difficulties which lead teachers and 
superintendents to be regarded as failures? How are these caused? 
How cured? 
2. What causes the following traits in pupils or in teachers? How should 
they be handled? 
(a) Indifference. 
(b) Bossiness. 
(c) Dependence. 
(d) Sense of inferiority. 
(e) Fear of failure. 
(f) Feeling of persecution. 
(g) Hostility to new ideas. 
(h) Egotism. 
3. How can one develop a better sense of humor? 
4. What is the origin and nature of conscience? 
5. What is the cause of, and the best method of dealing with disciplinary 
problems such as: 
(a) Cheating.
(b) Lying.
(c) Stealing.
(d) Contrariness and obstinacy.
(e) Cruelty.
(f) Unconventional sex behavior.
(g) Desire to annoy the teacher.
(h) Bullying.
(i) Bad temper.
(j) Reckless wildness.
6. What tends to bring about in pupils ideals such as:
(a) Neatness.
(b) Honesty.
(c) Courage.
(d) Unselfishness.
(e) Altruism, benevolence.
(f) Patriotism.
(g) Purity.
(h) Self control, temperance.
7. What conditions tend toward happiness in life? Can school be a
genuinely happy place for children?
8. Should the emotions be controlled, or expressed freely?
9. Of how much use is psychological prediction in vocational guidance?
10. What is the place of a school psychologist? Of a personnel advisor in
higher education? *Visiting teacher?
11. What are the psychological factors involved in direct ethical instruction?
Can it be wisely done in public schools?
12. Can a teacher be too friendly to retain the respect of pupils?
13. What is the psychology of “leadership?”


III. What are the Outstanding Interests of Children at Different Ages? How Can 
These Best Be Related to Education? 


1. Are there predominant interests among:
(a) Pre-school children?
(b) Children in Grades I to VI?
(c) Junior high school pupils?
(d) High school pupils?
(e) College students?
(f) Graduate students?
2. How can interests best be discovered? Developed?
3. In how far are modern “flappers” and “sheiks” different from the
young people of other generations? Why?
4. Are interests different in different sections of the country? How
different in rural and urban communities? Why?
5. In what way, if at all, should rewards and punishments be used?
6. What is the relation of perseverance, strong will power, etc., to interest?
7. Should pupils ever be compelled to study things they find distasteful?
8. How far can children plan their own programs?


IV. Problems of General Teaching Method. Rating.........
1. What are the psychological advantages and disadvantages of teaching
by each of the following methods? Under what conditions is each
best?
(a) Lectures?
(b) Textbooks?
(c) Committee investigations?
(d) Group discussion?
(e) Teacher-controlled enterprises?
(f) Projects?
(g) Recitations?
(h) Memory work?
(i) Drill?
(j) Review?
(k) Quizzes and examinations?
(l) Case methods?
2. In general, what kind of classroom procedure will make for:
(a) Initiative? Resourcefulness?
(b) Scientific attitude of mind?
(c) Good judgment?
(d) Social adaptability?
(e) Accurate and objective self-appraisal?
3. What is the customary thinking process? How can this be improved?
4. How strengthen memory? Imagination?
5. Psychologically, what are the conditions for effective study?
6. Can pupils be made to enjoy work?
7. Should children help choose the teaching method?


V. Problems of Teaching Methods for Special Subjects. Rating..............
1. What particular psychological factors must be considered in evaluating
special methods for teaching:
(a) Reading.
(b) Arithmetic.
(c) Foreign language.
(d) Social science.
(e) Physical training.
(f) Writing.
(g) English.
(h) Appreciation of art.
(i) Psychology.
(j) Religion.
(k) Spelling.
(l) General Science.
(m) Music.
(n) Household arts.
2. Do special abilities and disabilities exist? How should such cases be
diagnosed? Treated?
*3. How can one keep abreast of literature on teaching special subjects?


VI. Problems Involved in Selecting Curricula and Texts. Rating.........
1. What is the psychological validity of curricula based upon:
(a) Stages in child development?
(b) General strengthening of the mind?
(c) Activity analyses?
(d) Judgment of educators as to what children ought to have?
(e) Immediate life interests of children?
2. What, psychologically, are the characteristics of a good textbook?
*3. “Electives” vs. “Required Courses” in high school.
4. Text vs. reference work, outside investigations, etc.


VII. Problems in the Development of Skills. Rating.........
1. Under what conditions does practice make perfect? At what rate?
2. What is the “proper balance” of theory and practice in:
(a) Teacher-training?
(b) Athletic coaching?
(c) The development of character?
(d) Typewriting?
(e) Engineering?
(f) Medicine?
*(g) Practical arts?
3. What habits should be mechanized?


VIII. Problems of Measurement. Rating.........
1. How can original nature, inborn ability, be measured?
2. How can achievements in school subjects be measured?
3. How can skills be measured?
4. How can attitudes and habits of mind be measured?
5. How can personality traits be measured?
6. How useful are the different types of examination?
7. What sort of marking system is psychologically valid?
8. What is the place of measurement in education? Is it overstressed
now? What dangers are there in it?
9. What are the most valuable by-products of measurement?
*10. How much attention, what proportion of time, should be given to
measurement?


IX. Individual and Group Differences. Rating.........
1. Under what conditions are special classes desirable for certain groups of
children?
2. What are the advantages and disadvantages of co-education? What
are the sex differences, psychologically?
3. What are the psychological differences between nationalities and races?
Do these justify discrimination?
4. Individual differences being as they are, is democracy psychologically
justifiable?
5. What changes should be made in ordinary procedure, to fit the special
needs of sub-normal and super-normal children?
6. Is the Dalton plan advisable?
7. Are there standards to which we hope all children will attain?


X. Problems Involving Extra-curricular Activities. Rating.......
1. What are the psychological values and dangers in:
(a) Athletics?
(b) Clubs?
(c) Plays and dramatizations?
(d) School dances?
(e) Debate?
(f) Contests?
(g) School papers?
etc.
2. Under what conditions does recreation take place most effectively?
3. What is the influence of stories, plays, movies, upon children?
4. What is the psychological contribution of art: painting, sculpture,
music, poetry, interpretive dancing, etc. in school life?
*5. Is over-stimulation a real or imaginary danger? Should limits be 
placed upon participation? 
*6. What are the psychological consequences of censorship? 


XI. Problems of Inter-relationships and Transfer of Training. Rating......... 

1. How far is it possible for the school to counteract or to improve the 

psychological influence of: 
(a) Home? 
(b) Gang? 
(c) Community attitudes? 
(d) Recreations? 
(e) Industrial order? 
(f) National policies? 
(g) Religious teachings? 

2. How much carry-over is there from one class to another? How much 
from school to groups outside? Under what conditions does trans- 
fer take place? 

*3. Is there any place for learning which will not transfer to other life- 
situations? 

*4. Is “citizenship” capable of development within the school room, 
regardless of the community situation? 


XII. Problems Relating to the Home as an Educational Institution. Rating..... 
1. Under what conditions is a marriage likely to be successful? 
2. How far is a eugenic program justifiable, psychologically? 
3. What sort of child training in pre-school years leads to the best results?
What habits, specifically, should be formed? How? *Is nursery
school preferable to home?
*4. What use can parents make of tests and test results? 
*5. What sex education is advisable in schools? At home? 


XIII. Problems Involved in Dealing with Adults. Rating..................... 
1. What are the most effective methods of integrating communities in
which the people are divided into factions? How handle schisms,
feuds, religious conflicts?
2. How can new ideas be best introduced into a conservative school
constituency?
3. What are the best methods for dealing with radical or reactionary
prejudices among teachers or community leaders?
4. How best deal with irate parents?
5. What forms of advertising are most effective?
6. What makes a teacher “popular” or “unpopular” in a community?
7. How should a superintendent choose his teachers?
8. Under what conditions is supervision most likely to be useful to teachers?
9. How may morale be built up in a teaching staff?
*10. How long should a teacher stay in one place?
*11. How handle unsportsmanlike spectators at games?


XIV. Problems of the Interaction of Physical and Psychological Factors. Rating....
1. What is the effect upon mental activity, of:
(a) Alcohol?
(b) Coffee?
(c) Tobacco?
(d) Drugs?
(e) Rest and fatigue?
(f) Sleep?
(g) Diet?
(h) Ventilation?
(i) Posture?
(j) Exercise?
(k) Glandular secretions, and their disorders?
(l) Sensory defects?
2. Under what conditions do mental attitudes affect physical ills? 
*3. What is a desirable physical environment for study? 


XV. Problems Involving Psychological Schools and Theories. Rating.........
1. What are the outstanding emphases in the psychological [illegible] of:
(a) Behaviorists?
(b) Introspectionists?
(c) Psychoanalysts?
(d) Psychiatrists?
(e) Associationists?
(f) The “Gestalt Psychologie?”
(g) The “Faculty” psychology?
(h) The “New” psychology?
(i) Structuralists?
(j) Functionalists?
(k) Mechanists?
(l) Vitalists?
2. What is meant by a scientific viewpoint and method in psychology? 
3. What are the advantages and disadvantages of such methods of psychological
study as:
(a) Uncontrolled observation?
(b) Controlled observation?
(c) Questionnaires?
(d) Tests and scales?
(e) Case studies?
(f) Physiological experiments?
(g) Experiments with animals?
(h) Investigation of practices of primitive groups?
4. Is the concept of instinct desirable? If so, in what meaning? How
did such “instincts” develop?
5. What should we understand by “sensation,” “percept,” “concept?”
6. What should we understand by “intelligence,” “emotion,” “behavior?”
7. What should we understand by “unconscious,” “libido,” “traumatic
fixation,” “symbolization,” “compensation?”
8. What is the relationship of mind and brain? 
9. What have been the principal contributions of outstanding psychol- 
ogists in the past? Of living psychologists? 
XVI and following: To be added below: 




In September, 1925, this questionnaire was sent to a selected 
30 members of the faculty of the school of education, one person being 
chosen to represent each vocational interest. Seventeen complete 
replies were received, with valuable suggestions from others. At the
first class session in five courses in educational psychology the question-
naires were filled out. Group A is composed of junior and senior
undergraduates in Columbia College and Barnard College; 96 per 
cent of them have had no teaching experience, 88 per cent of them are 
expecting to teach in secondary schools. Their intelligence scores on 
the Otis S-A Higher Examination average almost one quartile higher 
than do the scores in any of the other groups. Group B is composed 
of undergraduates taking a Saturday morning course in educational 
psychology. Most of them are experienced teachers and adminis- 
trators who are acquiring credits toward a degree, while engaged in 
teaching during the week. Group C is a miscellaneous group of 
students who are outside the beaten paths. The largest number of 
them are interested in art, music, and applied arts. Others have had 
experience as directors of physical education. Their median on the 
Otis test falls below the lower quartile division of Group A. Groups 
D and E represent the large group of regular graduate students in 
education. In Group D we find some persons coming in for a Saturday 
morning course while carrying on field responsibilities. The following 
survey of interests in this group would correspond very well to Group E
also. 


                                                          Per Cent
[illegible].................................................. 28
[illegible].................................................. 17
College teaching or administration........................... 14
Administration of public schools............................. 10
Supervision of elementary schools............................ 9
[illegible].................................................. 6
Clinical psychology, vocational work, etc.................... 6
[illegible].................................................. 6
[illegible].................................................. 5

The first part of Table I presents a summary of the replies.
The figures 1 to 15 in the left-hand margin correspond to the general
divisions of the questionnaire. Thus I represents problems of original
nature, heredity, and environment, II problems of personality
adjustment for teachers and pupils, etc. The first column gives the median,
mode, and semi-interquartile range for the 17 faculty ratings. The 
columns headed A, B, C, D, and E, correspond to the groups of stu- 
dents who rated the problems in the fall. 
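
The three summary statistics named above can be illustrated with a short sketch. The article does not state how its medians and quartiles were computed (its fractional medians suggest interpolation within grouped scores), so the plain sample statistics below are an assumption, and the 17 ratings are hypothetical values invented for illustration, not data from Table I.

```python
from statistics import median, multimode, quantiles

def summarize(ratings):
    """Return the median, mode(s), and semi-interquartile range Q of a list of ratings."""
    # Quartiles by linear interpolation over the observed ratings (an assumed method).
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    return {
        "median": median(ratings),
        "mode": multimode(ratings),   # list, since a distribution may have several modes
        "Q": (q3 - q1) / 2,           # semi-interquartile range
    }

# Hypothetical ratings of one topic by 17 faculty members on the 1-10 scale.
faculty_ratings = [10, 9, 9, 8, 10, 7, 9, 10, 8, 9, 6, 9, 10, 8, 9, 7, 9]
summary = summarize(faculty_ratings)
print(summary)
```

With ratings bunched at the high end, as the article reports, Q is small relative to the ten-point scale, which is why small differences between medians carry little weight.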

The points of agreement and difference are brought out more clearly 
in Table II, in which the 15 topics are ranked in accord with the aver- 
age value assigned by the various groups. Where medians coincided 
means were used to eliminate as many as possible of the ties in rank. 
The agreement appears to be unexpectedly high. It seems clear that 
the student groups are more like one another than they are like the 
faculty. The faculty differ in showing slightly more concern for prob- 
lems of general teaching method, curricula, and texts, the development 
of skills, and the study of individual differences, but rather less interest 
in the study of children’s interests, in problems of personality adjust- 
ment, in the influence of physiological factors, and problems of home 
making and pre-school child training. 
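
The ranking rule described at the start of this paragraph (order by median, with the mean used where medians coincide) might be sketched as follows. The function name and the four topic values are hypothetical; only the rule itself comes from the text.

```python
def rank_topics(medians, means):
    """Rank topics so that rank 1 has the highest median.

    Where two medians coincide, the topic with the higher mean is ranked first.
    """
    order = sorted(medians, key=lambda topic: (-medians[topic], -means[topic]))
    return {topic: position for position, topic in enumerate(order, start=1)}

# Hypothetical medians and means for four of the fifteen divisions.
medians = {"I": 8.5, "II": 9.5, "III": 8.5, "IV": 9.5}
means = {"I": 8.1, "II": 9.0, "III": 8.4, "IV": 9.2}

ranks = rank_topics(medians, means)
print(ranks)
```

Topics II and IV share a median, as do I and III; in each pair the mean alone settles the order, which shows how a full step in rank can rest on a very small difference in rating.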

Probably the conversion of the ratings into ranks distorts the 
emphasis somewhat. The distributions were neither rectangular nor 
normal, but skewed toward the high end with a few low stragglers. 
On the more important topics the median fell within the group rated 
as “10” in value, and is recorded in Table I as 9.50. The lowest 
composite median was 6.42, the difference between these extremes being 
only two or three times the size of Q for any distribution of ratings 
given to a single topic. In the middle of the scale of topics, from the
viewpoint of value, the difference between medians, which is translated
as a total step in rank, was frequently less than one-fiftieth of the
Q for either distribution. The surprising fact is that, with the scale so
abbreviated, the differences so slight, and the proportional unreliability of
differences between certain topics so high, there was so much
agreement in rank.

Table III presents the inter-correlations (rho method transmuted
to “r”) between the ratings given by each group and those given by
every other. Groups A to E had an average agreement among them- 
selves at the beginning of the year of .74. The average correlation 
between the faculty order of choice and the order preferred by student 
groups was .67. Group C, as might be expected, differs most sharply 
from other groups, due in part to the unusual value which these physical 
education and practical arts students placed upon the study of physio- 
logical factors, and the study of problems of home-making, with a cor- 
responding lack of concern for individual differences or problems of 
curricula and texts. 
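
A minimal sketch of the "rho method transmuted to r" follows. The article does not give the conversion it used; the formula below, r = 2 sin(πρ/6), is the conventional Pearson transmutation and is assumed here. The two rank orders are hypothetical, and the rank-difference formula assumes no tied ranks.

```python
import math

def spearman_rho(ranks_x, ranks_y):
    """Spearman rank-difference coefficient for two untied rank orders."""
    n = len(ranks_x)
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))

def rho_to_r(rho):
    """Transmute rho to r by the assumed conventional formula r = 2 sin(pi * rho / 6)."""
    return 2 * math.sin(math.pi * rho / 6)

# Hypothetical rank orders given to five topics by two groups.
faculty_ranks = [1, 2, 3, 4, 5]
student_ranks = [2, 1, 3, 5, 4]

rho = spearman_rho(faculty_ranks, student_ranks)
r = rho_to_r(rho)
print(round(rho, 3), round(r, 3))
```

The transmuted value differs from rho only slightly, so the choice of conversion has little bearing on the comparisons between groups.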

While the ratings given to each of the topics on the questionnaire 
at the beginning of the term formed the basis for a syllabus which was 
useful during the semester, it seemed wise to study the extent to which 
opinion changed during the course. Did students come to a realiza- 
tion that the problems they had supposed to be of major concern were 
really of very much less value than they had anticipated? Did further 
study open up to them possibilities they had not seen at first? Groups 
A, C, and E, finished their work in educational psychology in one 
semester. They were given the questionnaire again, at the very end 
of their course, with the suggestion that they indicate how they would 
rate each item for students who might take the course next year. 
It was suggested that they might have changed their minds about some 
of the elements. The questionnaires here, as at the beginning, were 
entirely anonymous and understood to be wholly unrelated to the 
student’s class standing. 

A summary of the ratings given by these three groups at the end 
of the term may be found in the sixth, seventh, and eighth columns of 
Table I. In Table II the rank position assigned each of the problems 
at the end of the semester study of educational psychology is indicated. 
Inspection shows that the order does not differ markedly from that 
set forth at the beginning of the term. Group A places less value on 
problems connected with the home, and with adults. Group C has 
come to a realization that problems in the discovery of interests at 
various ages are not so important as supposed, while problems in the 
development of skills are given more consideration. Group E has
come to value problems of measurement more highly. 

Table III shows correlations between the order at the beginning
and that at the end of the term, of .90, .64, .86 for groups A, C, and
E, respectively. This is slightly higher agreement than existed 
between one group and another at the beginning of the year. It is 
interesting to note further that the average agreement between these 
groups at the beginning of the year was .72, that at the end of their 
study .79. There may have been a slight tendency for groups to
become more alike as they studied. 

It seemed that these groups might show undue agreement with 
themselves because they had done most study upon the topics given the 
highest rating. They might agree with their original ratings, largely 
because of the set-up of the course to meet those original ratings. 
Group F was therefore also studied. Group F is made up of students 
very much like those in groups D and E, except for the fact that it is 
taught in a more traditional content and method. Group F had no 
questionnaire at the beginning, and no modification of method in the 
direction of any of the topics suggested. It was an excellent old-line 
course in educational psychology taught by an able teacher. If these 
graduate students with teaching experience tend, after study of the 
traditional subject-matter, to agree with the groups formerly studied, 
then it may be supposed that a general agreement exists, independent 
of the suggestions offered by a particular instructor. 

The columns headed “F” in Tables I, II, and III show the average
ratings given by this group, the rank order of the topics in their 
judgment, and the correlation of this order with that set by the other 
groups. The average correlation between the interests of this group 
which had not been influenced by instructors interested in the topic 
evaluation, and the interests of the experimental groups which had 
spent a semester in the study of the topics they themselves rated high- 
est, was .81 or even higher than the average agreement among those 
experimental groups. 

It seems clear that there is a large amount of agreement among 
these 622 votes by persons interested in education, showing the relative 
value of the topics as suggested in the questionnaire. Faculty mem- 
bers, experienced teachers, administrators, and under-graduates who 
have never taught, all of them agree fairly well upon the emphasis they 
would like in an educational psychology course. This concurrence is 
relatively little influenced by a semester of study, either of the course 
they seem to prefer or of the traditional course. It may be that this
apparent agreement is due to the form of the questionnaire and the 
choice of words within it. Perhaps there is a tendency to rate higher 
the topics which were elaborated more fully, or those near the begin- 
ning. It is planned to check this influence by the formulation of 
another survey blank in which the topics shall be expressed in different 
words, arranged in different order, and accorded equal space for 
elaboration and illustration. 

Whether the agreement be due to the form of the questionnaire 
or to the real differences in value or to both, it is interesting to compare 
the choices made by educational psychologists with the choices 
made by these other groups of educators. A weighted composite for
the Teachers College groups was made by weighting each group average
in accord with the number of students in the group. Only “end of
the term” ratings were used for groups by which the topics were rated
twice. Each faculty judgment was counted as equivalent to five 
student judgments. The composite ranks, both when weighted and 
when unweighted, are presented in the eleventh and twelfth columns 
of Table II. The correlations of each group with these composites 
may be found in Table III. The Teachers College groups yield an 
average correlation of .84 with the weighted composite and of .85 with 
the unweighted composite. 
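The weighting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group sizes and mean ratings are invented, and the function names `weighted_composite` and `ranks_desc` are hypothetical. The one rule taken from the text is that a faculty judgment counts as five student judgments.

```python
FACULTY_EQUIVALENT = 5  # from the text: one faculty judgment = five student judgments


def weighted_composite(group_means, group_weights):
    """Weighted mean rating per topic across groups."""
    groups = list(group_means)
    n_topics = len(group_means[groups[0]])
    total_weight = sum(group_weights[g] for g in groups)
    return [
        sum(group_means[g][t] * group_weights[g] for g in groups) / total_weight
        for t in range(n_topics)
    ]


def ranks_desc(values):
    """Rank 1 = highest value; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


# Hypothetical data: 40 students and 4 faculty members rating three topics.
means = {"students": [6.0, 8.0, 4.0], "faculty": [9.0, 5.0, 4.0]}
weights = {"students": 40, "faculty": 4 * FACULTY_EQUIVALENT}

composite = weighted_composite(means, weights)
composite_ranks = ranks_desc(composite)
print(composite, composite_ranks)
```

Averaging tied ranks is an assumption of this sketch; it would account for half ranks such as 3½, but the article does not state how ties were handled.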

The same questionnaire was sent to 15 active leaders in the field 
of educational psychology. Eight of them in eight different univer- 
sities responded by rating the suggested problems. The mean of their 
ratings is given in the farthest right hand column of Table I. In 
Table II the order which they would assign to the topics is shown in 
the last column. In Table III may be found the correlations between 
this order of emphasis, and that made by each other group. They 
range from .09 to .66, the latter being, as might be expected, the 
agreement with the group which had studied the most traditional 
content in their educational psychology. The correlation of the educa- 
tional psychologists with the composite of the other groups was only 
.47 or .50. Either the form of the questionnaire affected the
educational psychologists very differently, or they have a different
perspective and emphasis as they look at the field of possible problems. More
careful scrutiny of Table II suggests that the educational psychologists 
are much more interested in the psychology of skills, in measurement, 
in schools and theories, and in the psychology of special subjects. 
They seem to see distinctly less value in teaching the psychology of 
group inter-relationships, extra-curricular activities, general
educational method, or the selection of texts and curricula.

The next problem confronted was that of determining the amount 
of course time to be given to each topic in the light of the values 
assigned. Here some assumption as to the zero point or the relative 
value of best and poorest topics, is essential. If ratings are to be
regarded as meaning literally their word equivalents in the
“Directions,” and a zero value on the scale is to be taken as equivalent to
zero time in the course, then each topic should be given practically 
as much attention as any other. The one rating highest would occupy 
8 per cent of the time, the one rating lowest 6 per cent of the time.
In the judgment of the instructors, the difference of opinion was far 
more significant than that. In texts it was found that the topic given 
most prominence, among these listed, had often more than 20 times as 
much attention as the one given least prominence. It was determined 
to assume that the topic placed at the head of the list for
“importance, value and interest” should be given approximately 10 times as
much attention as the one placed at the foot of the list. 
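One way to turn mean ratings into time shares under the two assumptions discussed above is sketched below. The article does not state the exact arithmetic used, so the linear shift of the zero point, the function name `time_shares`, and the sample ratings are all assumptions made for illustration.

```python
def time_shares(ratings, top_to_bottom=None):
    """Per cent of course time per topic.

    top_to_bottom=None : literal reading, zero on the scale means zero time.
    top_to_bottom=k    : shift the zero point so the highest-rated topic
                         receives k times the share of the lowest-rated one.
    """
    lo, hi = min(ratings), max(ratings)
    if hi == lo:
        # All topics rated alike: equal shares under either assumption.
        return [100.0 / len(ratings)] * len(ratings)
    if top_to_bottom is None:
        zero = 0.0
    else:
        # Solve (hi - zero) / (lo - zero) = k for the shifted zero point.
        zero = (top_to_bottom * lo - hi) / (top_to_bottom - 1)
    adjusted = [r - zero for r in ratings]
    total = sum(adjusted)
    return [100.0 * a / total for a in adjusted]


# Hypothetical mean ratings for three topics.
ratings = [8.0, 7.0, 6.0]
literal = time_shares(ratings)
ten_to_one = time_shares(ratings, top_to_bottom=10)
print([round(s, 1) for s in literal])
print([round(s, 1) for s in ten_to_one])
```

Under the literal reading the shares stay close together, which is the near-equal division the text describes; the shifted zero point spreads them until the top topic has ten times the share of the bottom one.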

The resulting apportionment in terms of per cent of the total 
course time, among these 15 topics is given in the first column of 
Table IV. The time apportionment on the basis of these judgments 
is there compared with the per cent of time allotted in the text books 
most frequently used in courses in educational psychology in colleges 
and universities. Douglas in The Journal of Educational Psychology 
for September, 1925, published an article entitled “The Present Status
of the Introductory Course in Educational Psychology in American
Institutions of Learning,” in which questionnaires sent to 73
institutions revealed that, in 1923-24, Starch, “Educational Psychology,” was
reported as a text 25 times, Gates, “Psychology for Students of
Education,” 18 times, and Strong, “Introductory Psychology for Teachers,”
13 times. These first three on the list represented 35 per cent of the 
total number of textbook mentions. Changes have undoubtedly taken 
place since, Gates’ book having been published only that year. Analy- 
sis was made of these three books to ascertain the extent to which they 
agreed with one another and with the criterion established through the 
judgments on this opinion questionnaire. Each page of each book was 
studied and classified by the writer as contributing “directly” or
“indirectly” to one or more of the 15 topics. A single measure of the
attention given by each group to each topic was then obtained by
calling five pages of “indirect” reference equivalent to one page of
direct discussion.
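The page-counting rule just described (five pages of indirect reference counted as one page of direct discussion) can be expressed as a short calculation. The page counts below are invented, and `attention_percentages` is a hypothetical name used only for this sketch.

```python
INDIRECT_PER_DIRECT = 5  # from the text: five indirect pages = one direct page


def attention_percentages(pages):
    """pages maps topic -> (direct_pages, indirect_pages).

    Returns the per cent of total weighted attention given to each topic.
    """
    weighted = {
        topic: direct + indirect / INDIRECT_PER_DIRECT
        for topic, (direct, indirect) in pages.items()
    }
    total = sum(weighted.values())
    return {topic: 100.0 * w / total for topic, w in weighted.items()}


# Hypothetical page counts for one textbook.
example_pages = {
    "Measurement": (30, 10),     # 30 + 10/5 = 32 weighted pages
    "General method": (12, 15),  # 12 + 15/5 = 15 weighted pages
    "Home": (1, 5),              #  1 +  5/5 =  2 weighted pages
}
percentages = attention_percentages(example_pages)
for topic, pct in percentages.items():
    print(f"{topic:<16}{pct:5.1f} per cent")
```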
Of course, this involved a large amount of personal judgment
as to how any given page should be classified. Two other persons 
familiar with the questionnaire and with the texts were good enough 
to make independent appraisals. In Table IV the criterion column 
represents the composite judgment of value. The second column 
represents the time which would be allotted by the average educational
psychologist, based on the 8 who reported and computed with the 
assumption stated above that the topic rated highest should receive 
about 10 times as much attention as the topic rated lowest. Under 
“Starch” are given the percentages of space devoted to each topic, in 
the judgment of two independent persons, X and Y, together with the 
combination of their judgments. Gates’ book was classified by X, 
Y and Z independently; Strong by only X and Y. Table V shows the 
reliability of these estimates in terms of the inter-correlations. Gates’ 
book proved harder to classify than that of either Starch or Strong, 
partly because his treatment of the physiological basis of learning and 
of learning by analysis, etc. could be attributed indirectly to almost 
every topic in the questionnaire. Table VI gives the intercorrelations
of these time divisions set forth in Table IV. It indicates that there
is practically no correlation between the “value” order and the time
distribution in Starch or Strong, but a correlation of .47 with the 
educational psychologists and with the time distribution in Gates. 
On the whole the books tend to be more like one another (.41) than like 
the criterion of value (.17). The educational psychologists, none of 
whom were authors of these texts, tended to agree with the criterion 
little better than did the texts. The fact of difference in emphasis 
between educational psychologists and other educators such as those 
represented at Teachers College seems unavoidable. 
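The two averages quoted in this paragraph can be checked against Table VI. The sketch below uses the coefficients as they can be read from the table in this copy, so the input values are themselves an assumption about a partly garbled source.

```python
from statistics import mean

# Coefficients as read from Table VI (an assumption about a damaged table).
book_vs_book = {"Starch-Gates": 0.21, "Starch-Strong": 0.34, "Gates-Strong": 0.69}
criterion_vs_book = {"Starch": 0.04, "Gates": 0.47, "Strong": 0.02}

mean_between_books = mean(list(book_vs_book.values()))
mean_with_criterion = mean(list(criterion_vs_book.values()))

print(f"books with one another:   {mean_between_books:.3f}")
print(f"books with the criterion: {mean_with_criterion:.3f}")
```

On this reading the first average agrees with the .41 given in the text, and the second comes to a little under .18, close to the .17 reported.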

Table IV is probably more significant than are the correlations. 
It shows that each text has its major line of emphasis, Starch giving 
attention mainly to special methods, and measurement; Gates to 
general method and original nature; while Strong stresses physiological 
factors and measurement. In general the professional educational 
psychologists would give more attention than other educators would 
like to see them give to the psychology of skills, to the psychology of 
special subjects, to measurement, and to schools and theories. They 
would seem to under-emphasize the psychology of group inter-relations, 
extra-curricular activities, the selection of texts and curricula, and the 
adjustment of conflicts in the emotional or personality field. 

The difference between texts and criterion is unfortunately larger 
than it seems. Thus while many educators are interested in such 
“problems involving adults” as the psychology of school advertising,
handling irate parents, bringing about unity of community purpose 
and similar projects, the only direct discussion of relations to adults 
in the psychology textbooks deals with the ability of adults to memo- 
rize, or the tendency of adults to rationalize, or the appearance of certain 
instincts in adults. Yet these are all classified alike under the general 
topic of adult relations, and there appears to be agreement between 
texts and criterion. Again, it would look as though curricula and 
texts received fair treatment, so far as space in texts is concerned. 
Yet the actual analysis reveals that the pages so classified in the 
textbooks dealt very little with such fundamental problems as the 
psychological factors involved in building curricula (with the excep- 
tion of spelling words) upon bases of the needs or interests of adults or 
children. No discussion of text construction or selection, as such, 
may be found in any of these texts. 

The figures tend to conceal rather than to exaggerate the real 
differences in selection and in treatment between the values set by 
educators and the topics treated in texts. If these differences in 
emphasis were taken into account it is probable that there would not 
be better than zero correlation between the emphasis given by popular
textbooks to a long list of items which might be studied, and the
emphasis which experienced teachers and students of education would
place upon those same elements.


There are several defenses of the present distribution of attention 
in courses and textbooks which merit consideration. One is that some 
items are more difficult to learn than others, and hence require more 
exposition than do other simpler, but more important items. The
differences are probably very slight, at best, since each topic is a
complex requiring many types of learning. In the absence of any factual
studies about the difficulty of learning the psychological principles 
applicable to one topic as contrasted with those concerned in a different 
topic, the writer is inclined to believe that those differences are not in 
accord with the textbook emphasis. If these differences do exist they 
are probably in the direction of requiring more time for such varied 
and prejudice-dyed topics as “temperaments” and “extra-curricular
activities,” and less for experiments in maze-learning and addition.

A very important problem is raised by the suggestion that while 
certain elements may not seem very important in themselves, yet they 
may be essential to the best understanding of a wide range of situations. 
The baby’s reflex and the rat’s maze may be units in which love and 
war and human evolution become understandable. Yet even if this 
were so, there would remain the question as to the real transfer from
learning about the small units to interpreting the complexes. People 
may not learn to read best by learning the alphabet. The best way to 
understand education may be in terms of studying present actual 
situations rather than more fundamental units and principles. Experi- 
ments at present under way may throw some light upon this problem 
when intelligent adults are involved. Meanwhile it is sufficient 
to point out that the theory of most educational psychologists about 
other subjects is in contrast to their practice in their own subject.

Another important defense is that while some of these topics
may be important, little is known about them, whereas about such 
factors as the general divisions of the nervous system or the structure 
of sense organs, it is relatively easy to give truthful information. 
It is possible that the limits of evidence upon a few topics may restrict 
the time given to them to less than is suggested by the importance 
criterion. Surely the converse is less evident. Multitudes of wasteful 
travesties on education could be perpetuated or originated if the mere 
availability of factual data were sufficient reason for giving time to a 
study. That argument would defend describing bricks in houses or 
counting sands by the sea. Moreover, within the limits of an ordinary 
course in educational psychology, containing as did most of these, 
about 50 hours, there are few, if any, of the topics listed in the question- 
naire upon which there is not available at the present moment far more 
reliable experimental material than could effectively be used in the 
maximum time allowed to any one topic, i.e., 7½ hours. It must be
remembered that most teachers would not spend all of that time pour- 
ing forth data. No small portion of the task of any teacher of educa- 
tional psychology upon the basis suggested here, must be spent in 
developing within students skill in analyzing their own teaching prob- 
lems so that the issues upon which factual evidence is useful, emerge in 
their real and natural setting. An important difference between a good 
psychologist in education and a poor one, is that while both may 
be able to repeat the principles of transfer in terms of identical ele- 
ments, the good psychologist sees just how that principle is involved 
in spanking a naughty child, or admitting Jews into a private school. 
The development of that skill and insight, if indeed it be the task of 
educational psychology, can profitably proceed far with a modicum of 
statistical data and controlled experiments, necessary as these are.

Certain other objections of importance are raised against such an
outline of content for educational psychology as is here suggested. 
One is that these results hopelessly confuse the task of policy-maker
and executant in education. There is, however, a significant tendency
in education at such points as curriculum construction, disciplinary 
problems or parental contacts, to combine these functions. No 
assumption as to the value of either program of teacher-training is 
essential to this procedure. If it were demonstrated that the training 
should be differentiated, this would require simply separation of groups 
and the re-weighting of problems in accord with each interest. 

One of the most serious considerations raised is that of the relation 
of these problems outlined to other fields of education. Some of them 
are obviously administrative, supervisory, or philosophical. Is 
educational psychology to slip into the common but disagreeable 
attitude which claims the whole of education as its private preserve? 
Is it not a waste of time to deal with very similar problems in several 
different courses? 

This difficulty is due to the way in which curricula distort life. 
The decisions made by educators are rarely capable of being classified 
as philosophical decisions, psychological decisions, sociological deci- 
sions, administrative decisions, etc. Is the sullen boy a problem in any 
one of these fields? Clearly these courses do not represent separate 
divisions of an educator’s task, but separate viewpoints upon it. If it 
prove wasteful to consider the same problem in several different classes, 
perhaps the entire professional training may well be unified around the 
problem, while psychologist, sociologist, administrator, philosopher, 
and historian offer their resources. 

A final word may well be directed to the systematization of learning 
which grows out of such study. Will such a treatment of psychology 
in its natural habitat leave the student without those logical divisions
with subheads and sub-subheads which have brought a sense of
mastery to his predecessors? If a new approach has brought a mastery
of the situations in which educators need help, then this
systematization may have been sacrificed for great gain. However, it may be
possible to retain both. It has been found practicable at the end of
such a series of problem studies to look back over the whole process. 
The business of organizing, from time to time, the principles which 
have emerged, into systems, is not seriously inhibited by the fact that 
the starting point of thinking has been the problem rather than the 
scheme. One happy consequence is that not alone one, but several 
schemes of relationship become possible. Educational psychology 
tends, under such circumstances, to develop primarily the
perception of significant elements in complicated educational situations
with the facts of experience which are related to these elements.
Only secondarily does it become a matter of chapters and divisions.

In summary, study of the present predicament of educational
psychology suggests the following conclusions: 

1. There is probably significant agreement among faculty and 
student members of Teachers College as to the points at which they 
wish psychology to help them. This agreement is not noticeably 
affected by experience in teaching, or in studying educational psy- 
chology. Further study is needed to separate real agreement from 
possible errors due to arrangement of the questionnaire. 

2. The apparent concurrence of opinion would place major empha- 
sis upon problems of emotion and personality adjustment, problems of 
original nature and heredity, and problems of general teaching method. 
Little emphasis would be placed upon the psychology of the acquisition 
of skill, or the psychological problems involved in home-making and 
pre-school child training, or the various schools and theories in 
psychology. 

3. The most-used textbooks do not agree with one another in their 
emphases, but much less do they agree with the criterion based upon 
judgment of value. 

4. Representative educational psychologists rate the problems 
suggested in distinctly different fashion from either the distribution of 
emphasis suggested by other educators or that found in texts. 

5. The writer is of the opinion that the best present procedure in 
the teaching of educational psychology is to follow the order of value 
set by the agreement among educators, meanwhile investigating fur- 
ther the relative difficulty of learning to handle each type of problem, 
the amount of transfer from each topic into educational situations, and 
the form of organization and systematization which will most effec- 
tively extend this transfer. 

6. It should again be emphasized that the selection of problems for 
research should not be curbed by immediate and often short-sighted 
professional demands. It may well be, however, that research which 
endeavors to make a large practical contribution in the near future, 
will want to give special notice to fields here rated high in importance, 
value and interest, within which all too little scientific study has yet 
been made. 
1 The writer wishes to express his indebtedness to C. M. Derryberry, who prepared this table.
TABLE II.—RANK OF TOPICS WHEN RATED BY EIGHT DIFFERENT GROUPS
x —s oa ~| Rank at end 33 3s é 4 
— 3 | | | me | $2 | 333 
2 |A|B/C|D E\A; C |E/F| 28 | 58 see 
I Original nature....... 4} 3| 2| 1) 2) 21 3}2 | 2/2) 2 i = 
II Personality........../ 3) 1] 1) 2} 1} 1) 1)1 | 1) 1 1 1 3 
III Interests............ 10} 7} 3) 5) 3} 3} 711 | 4 4) 6 6 9 
IV General method...... 1} 2} 4! 3) 41 4) 2) 344| 71 6) 3 3 7 
V_ Special subjects..... ./11)11/10)12)13)13) 815 [12); 9) 11 12 6 
VI Curricula, texts...... 7|12| 5)14; 7; 9|12)14 | 9) 7 8 9 12 
Eee 13)14)13/15)15)15)14)10  (14)11) 15 15 5 
VIII Measurement........ 5} 4) 7) 9) 612) 4,5 | 5) 3) 5 5 1 
IX Differences.......... 2} 8} 8/13) 5) 6) 6| 34%] 3) 5) 4 4 4 
X Extra-curricular...... 7/10)11) 8| 9) 7|10; 8 | 8 8 15 9 8 
XI Inter-relationships | 
and transfer........ 7| 5| 6 6} 8 5) 5) 6 | 610) 7 7 14 
XII Home...............{14| 6/15) 7/14'10/11/ 7 [13] 7) 13 | 11 | 48 
RS i a ac ciakw os | 9} 9)14/10|10) 813)12 1012) 10 13 10 
XIV Physiological........ '15)13] 9| 4/1111 9/9 |11) 9) 12 | 10 | 1 
EV Theotles............ 7 ae awen 1513) 14 | 14 8 
| | | 























TABLE III.—INTER-CORRELATIONS OF RANK DISTRIBUTIONS GIVEN TO SUGGESTED
TOPICS BY STUDENT AND FACULTY GROUPS
| A|C|E| F | Fac- weighted | cational
Group | A | B C;D\E II | If | IT | IT} ulty —— compo- | psychol- 
| _ site ogists 
A 64|.72!.75|.78|.90).831.78).78 .83 .81 .85 39 
B |...|...|.59).91].75)|.80).52|.83).83| .68 .88 .86 .48 
Co fe.sj--. 62| .73|.70|.64).54 .60) .31 .58 .68 14 
BP hevoless a baad je ares pee poe .83 . 95 . 94 .46 
E sakes jn aden |. 73) 64.86) .70) .68 .82 .82 .09 
A ...[.+.[+.|-79]-85).81] .76 91 .93 57 
C gheveles sn a .76 .82 53 
E Je..[eeefee fee] 84] 81]  .96 94 45 
ae Se oe oe ee | 4 ae .85 .88 66 
I Ba shen shane clsesl wal . 86 .82 .43 
Unweighted composite|...|... a oe .97 47 
Weighted composite...|...|... ee om 50 
TABLE IV.—COMPARISON OF TIME ALLOTTED EACH TOPIC IN COMPOSITE JUDGMENT
WITH SPACE ALLOTTED IN TEXTS IN COMMON USE

All entries are percentages. Crit. = composite value judgment criterion;
Ed.ps. = average educational psychologist; X, Y, Z = independent judges;
[?] = entry illegible; bracketed topic name = name illegible, supplied from the text.

                                                 Starch                Gates                Strong
       Topic                    Crit. Ed.ps.      X    Y  Both      X    Y    Z  All      X    Y  Both
    I  Original nature             13     12      7    8     8    [?]   19    8   13      3    5     4
   II  Personality adjustments     15     11      1    2     1      9   11    8    9      5    3     4
  III  Interests                    7      6      1    1     1      4    1    1    2      1    3     2
   IV  General method              10      8      6    7     6     11   19   27   19     10   16    13
    V  Special subjects             3      9     24   25    24      5    6    8    6     10   15    12
   VI  Curricula, texts            10      4     12    8    10      4    3    4    4      1    1     1
  VII  [Skills]                     1     10      8    8     8      9    3    6    6      6    3     5
 VIII  Measurement                  9     13     13   15    14     10    9   13   11     21   15    18
   IX  Individual differences       9     10     11    8    10     10    6    6    7     13   14    13
    X  Extra curricular             6      1      1    1     1    [?]    2  [?]  [?]      1    1     1
   XI  Inter-relationships          7      1     10   13    11      5    5    4    5      1    1     1
  XII  Home                         2      2      1    1     1      1    1    2    1      1    1     1
 XIII  Adult relations              3      3      1    1     1    [?]  [?]  [?]  [?]      2    1     2
  XIV  Physiological factors        3      4      3    1     2     14   12    4   11     24   20    22
   XV  Schools and theories         2      6      1    2     2      3    3    7    4      1    1     1



































TABLE V.—RELIABILITY OF ESTIMATES ON TEXTS

Starch (X and Y) = .95
Gates (X and Y) = .84
Gates (X and Z) = .52
Gates (Y and Z) = .73
Strong (X and Y) = .91


TABLE VI.—INTER-CORRELATIONS OF TIME ALLOTTED EACH TOPIC BY CRITERION
AND BY TEXTS

                                  Educational
                                  psychologists   Starch   Gates   Strong
Composite judgment of value            .47          .04     .47     .02
Educational psychologists              ...          .36     .46     .44
Starch                                 ...          ...     .21     .34
Gates                                  ...          ...     ...     .69























THE MEASUREMENT OF INTELLIGENCE¹


C. S. SLOCOMBE


University College, London 
INTRODUCTORY 


Mental tests were first invented and used by Sir Francis Galton 
in 1883. During the following 20 years various researches on their use 
and value were made by other pioneers in the movement in America, 
France, Germany and England. But the tests used were, generally, 
for the measuring of specific abilities and capacities; and largely owing 
to the conflicting deductions of various workers, mental testing fell 
into temporary disrepute. 

However, in 1904 Professor Spearman,² as a result of his
mathematical and practical research, enunciated the theory that a definite
function, which he called general intelligence, existed, and that not 
only was it capable of measurement, but that all previous mental 
tests had, in part, really been measurements of this function. Unfor- 
tunately though most psychologists have implicitly adopted the theory 
of Professor Spearman and regarded many of the tests they have con- 
structed as intelligence tests, nevertheless—possibly because not until 
recently has Spearman put the mathematical proof of his theory in 
such a form as would satisfy all his critics—they have not proceeded 
upon any scientific principles in constructing their tests nor have they, 
in general, applied any infallible criteria to their tests to ascertain their 
true value. Now that Spearman has not only clearly defined the 
principles which should underlie test construction,³ but also deduced
mathematical criteria applicable to the tests to determine their value 
as measures of intelligence, it is to be expected that a science of intelli- 
gence testing will replace the present intuitive art. 

According to Spearman’s theory “each ability of a person is capable
of resolution into two factors: (a) a general one entering into all the 
abilities of the person, and (b) a factor specific to the particular ability 





1 Part of a thesis approved by the University of London for the degree of Ph.D. 
The research work was done at University College, London, under the direction 
of Professor C. Spearman. 

2 Spearman, C.: General Intelligence, Objectively Determined. American
Journal of Psychology, 1904. 

3 “Nature of Intelligence and Principles of Cognition.” 1923.
concerned.” An intelligence test may be regarded as a measure of
general intelligence, and of a number of specific abilities. Obviously 
the smaller the extent to which the latter enter into the process, the 
greater is the test as one of intelligence. As these latter are, by 
hypothesis, specific their total effect will, if a sufficiently large number 
of test-forms are applied, be zero. 

So that true general intelligence may be defined as that which 
would be measured by an infinite number of independent forms of 
test (in which the so-called noegenetic processes are crucial) each of 
infinite length, i.e., each containing an infinite variety of material.
Independent forms of test are those in which there is no intercorrela- 
tion of the specific factors. This general intelligence would not be 
capable of exact measurement. The practical consideration is the 
construction of such tests as will give as close an approximation as 
possible, and the determination of the principles underlying such 
construction. (Analogies, synonyms, etc. are regarded as forms of 
test or test-forms. A test is regarded as composed of one or more 
test-forms, and possible repetitions of these.) 

Having then defined general intelligence (at least implicitly) it 
is necessary to consider the accuracy with which tests so far con- 
structed and in use do measure this function. It seems impossible to 
discover their efficiency in respect to absolute general intelligence as 
defined, but it is possible to obtain their efficiency with respect to the 
perhaps more limited intelligence which they do profess to measure, 
i.e., with respect to their “hypothetical general factor.” The measure
of the accuracy will be the degree of correlation between each of the
tests used and a supposed infinity of similar tests, i.e., between each
of the tests and the hypothetical general factor, hereafter called g.
“These values, the correlations of the different tests with the
hypothetical general factor g itself measure what has been called the
‘intellective saturation’ of the tests; the degree in which excellence at
them indicates pure ‘general ability’ or physiologically speaking, the
degree in which they depend on the energy of the whole cortex.”¹


THE EFFICIENCY OF SOME WELL-KNOWN INTELLIGENCE TESTS


In a recent paper by W. S. Miller² a table of intercorrelations is
presented which appears to offer a means of obtaining the intellective 





1 Mental Tests of Dementia. Journal of Abnormal Psychology, 1914. 


2 Miller: Variation and Significance of IQ’s. Journal of Educational
Psychology, 1924.
saturation of some well-known American tests. From Miller’s
table, by the use of the formula¹

$$ r_{ag}^{2} = \frac{S(r_{ax}\, r_{ay})}{S(r_{xy})} $$

the intellective saturation of these tests has been calculated, with the
results shown in Table I.
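A minimal sketch of how the formula above might be applied to a table of intercorrelations follows. It assumes that S denotes summation over every pair of tests x, y other than the test a under consideration; that reading, the function name `intellective_saturation`, and the synthetic correlation matrix are assumptions of the sketch, not details given in the article.

```python
from itertools import combinations
from math import sqrt


def intellective_saturation(R, a):
    """Estimate r_ag for test `a` from a symmetric correlation matrix R.

    Numerator:   sum of r_ax * r_ay over all pairs (x, y) of other tests.
    Denominator: sum of r_xy over the same pairs.
    """
    others = [i for i in range(len(R)) if i != a]
    pairs = list(combinations(others, 2))
    numerator = sum(R[a][x] * R[a][y] for x, y in pairs)
    denominator = sum(R[x][y] for x, y in pairs)
    return sqrt(numerator / denominator)


# Synthetic check: if each test correlates with g by a known amount and the
# tests share nothing else, then r_xy = r_xg * r_yg and the formula should
# return each test's own correlation with g.
true_saturation = [0.9, 0.8, 0.7, 0.6]
n = len(true_saturation)
R = [
    [1.0 if i == j else true_saturation[i] * true_saturation[j] for j in range(n)]
    for i in range(n)
]
estimates = [intellective_saturation(R, a) for a in range(n)]
print([round(e, 3) for e in estimates])
```

With real data the specific factors do not cancel exactly, so the estimates would only approximate the saturations.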





TABLE I.—SHOWING INTELLECTIVE SATURATION OF NINE AMERICAN TESTS

                                       Intellective   Length in   Number of
Name of test                           saturation     minutes     forms included
Miller mental ability, form A              .89           19            3
[name illegible]                           .84           20            6
[name illegible]                           .83         varies
[name illegible]                           .94           22            8
[name illegible]                           .93           16            7
[name illegible]                           .94           27           10
[name illegible]                           .83           35            3
[name illegible]                           .93           42           10
Pressey senior classification A            .94           16            4














In considering the above table as offering evidence as to the rela- 
tive efficiency of the tests, two important reservations must be made: 
(a) The intellective saturation of a test varies with the group to which 
it is applied. The group to which these tests were applied comprised 
university high school freshmen, and it is obviously unfair to regard 
the Stanford-Binet test as unsatisfactory because of its relative
inefficiency as applied to these students, when it is intended for application
to defective or borderline children. (b) It has been found that the
efficiency of a test tends to vary with its position in the series of tests— 
that generally the later in the series it be placed the higher will appear 
its value. Hence it is possible that the high value of the Pressey test 
as shown is due to the fact that it may have been the last test presented. 

Neglecting these considerations, a study of the above table reveals 
the fact that the relative efficiency of the tests seems to be strikingly 
independent of (a) the time devoted to it, and (b) the number of





1 Loc. cit., Mental Tests of Dementia.
different forms contained in it. It may be mentioned that in the 9
tests considered there are 30 different forms of test used, of which 7
are pictorial, and 23 verbal. If now the intellective saturation of the
five best tests combined be calculated, it is found to be .99 (PE ± .002).
The five tests combined are Army alpha, Illinois, Terman, Otis, and
Pressey; and the calculation of the combined correlation is effected
by the formula,¹

$$ r_{(a_1 + a_2 + \cdots + a_5)g} = [\ldots] = .99 $$


If this be taken at its lowest likely value, i.e., less (5 PE = .01), the
intellective saturation, .98, still appears to be quite satisfactory. 
But what is the precise meaning of this figure? With what degree of 
accuracy do these five tests combined measure g? 
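The working of the combined-correlation formula is not legible in this copy, so the sketch below uses the standard expression for the correlation of an unweighted sum of standardized scores with another variable: the sum of the separate correlations with g, divided by the square root of n plus twice the sum of the inter-test correlations. Whether this is the exact form the author took from Spearman is an assumption. The inter-test correlations are also hypothetical: they are generated as products of the five highest saturations in Table I rather than taken from Miller's table.

```python
from itertools import combinations
from math import sqrt


def sum_correlation(r_with_g, R):
    """Correlation of (a_1 + ... + a_n) with g for standardized tests.

    r_with_g : correlations of each test with g
    R        : symmetric matrix of correlations among the tests
    """
    n = len(r_with_g)
    pair_sum = sum(R[i][j] for i, j in combinations(range(n), 2))
    return sum(r_with_g) / sqrt(n + 2 * pair_sum)


# The five highest saturations in Table I; inter-test correlations are
# assumed to arise from g alone (r_ij = r_ig * r_jg), which is hypothetical.
saturations = [0.94, 0.93, 0.94, 0.93, 0.94]
n = len(saturations)
R = [
    [1.0 if i == j else saturations[i] * saturations[j] for j in range(n)]
    for i in range(n)
]
combined = sum_correlation(saturations, R)
print(round(combined, 3))
```

Under these assumptions the combined value falls between .98 and .99, in the neighborhood of the figure reported, though the article's own result rests on the observed intercorrelations.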

This may be determined by considering the coefficient of
alienation, or “standard error of estimate of a second variable, knowing the
first.”²

Standard error of estimate of y from x

$$ = \sigma_y \sqrt{1 - r_{xy}^{2}} $$

where x = known variable = score in combined tests and y = estimated
variable = estimated g score. Then if the regression be
known, the standard error of an estimated g score from a known test
score = $.2\sigma_g$, when $r_{(a_1 + a_2 + \cdots + a_5)g} = .98$. If the total range of
g scores be regarded as $6\sigma_g$, then the standard error of estimate would
be 3.3 per cent. If three times the standard error be taken to include
all possible errors of estimate, then it may be said that the true g 
score may be estimated with certainty from the combined test score 
with an error not exceeding 10 per cent. In the majority of cases, of 
course, the error will be much less than 10 per cent, but in any individ- 
ual case selected for estimate the error may be 10 per cent, but is 
unlikely to be greater. 
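The arithmetic of this paragraph can be retraced directly, and the same calculation shows how slowly the error shrinks as the correlation approaches unity. The function names are hypothetical; the 6-sigma range and the three-standard-error limit are the conventions stated in the text.

```python
from math import sqrt

SCALE_IN_SIGMAS = 6      # total range of g scores, as assumed in the text
ERROR_MULTIPLE = 3       # three standard errors taken to cover all errors


def alienation(r):
    """Coefficient of alienation: standard error of estimate in sigma units."""
    return sqrt(1.0 - r * r)


def max_error_percent(r):
    """Largest likely error as a per cent of the whole 6-sigma scale."""
    return 100.0 * ERROR_MULTIPLE * alienation(r) / SCALE_IN_SIGMAS


for r in (0.90, 0.95, 0.98, 0.99, 0.999):
    print(
        f"r = {r:.3f}   standard error = {alienation(r):.3f} sigma   "
        f"error within +/- {max_error_percent(r):.1f} per cent"
    )
```

For r = .98 this gives a standard error of about .2 sigma and a limit of about 10 per cent either way, as stated above.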

Thus it is seen that even so high an intellective saturation as .98 
involves a 20 per cent range of possible error in measuring intelligence, 
and that such a range of error is involved in the measurement by a 
combination of five good American tests. Thus suppose a subject 
obtains 60 per cent score in the combined tests. We may say with 





1 Spearman: Correlation of Sums or Differences. British Journal of Psychology, 
1913. 


2 Kelley, Truman L.: “Statistical Method.” 1923, p. 173.
almost absolute certainty that his true intelligence is not less than 50
per cent and not greater than 70 per cent, on a scale of $6\sigma_g$.

Taking then this evidence, is the efficiency of present tests to be 
regarded as satisfactory? In view of the hypothetical and intangible 
nature of the function (the very existence of which has been a subject 
of prolonged controversy between foremost psychologists) which 
tests attempt to measure, it certainly may be regarded as very credit- 
able. But it cannot be regarded as satisfactory, in the sense that no 
greater degree of accuracy is desirable or possible. In fact the need 
for improved tests is stressed by nearly all psychologists, but few seem 
to understand how to effect the necessary improvement. 


SUGGESTED METHODS OF IMPROVING MEASURES OF INTELLIGENCE


It appears that the improvement may be achieved in five ways. 

1. By the use of more exact mathematical work. The coefficient 
of correlation may be an excellent measure of the relationship between 
two variables, but every textbook on statistics utters warnings as to 
the deductions to be drawn from them—warnings which are not 
repeated, and are often disregarded in considering the application of 
correlational methods to psychological tests. If in the use of such 
coefficients a rise in correlation from .98 towards 1.00 involves a reduc- 
tion in the range of possible error, in estimating one variable from 
another, from 20 per cent towards 0 per cent, it is obvious that either 
more exact work is required in the calculation of coefficients, or that 
some other mathematical device should be used for interpolation. 

2. By a more scientific and systematic selection of material and 
forms for the construction of tests. It is difficult to discover any guid- 
ing principle determining the selection of present test-forms. Con- 
stant reference is made in literature to reasoning and the higher mental 
processes. But what is reasoning? By what criterion is the height of 
a mental process judged? It may be assumed that the psychologists 
who devised the tests referred to, all used forms which they considered 
involved reasoning or the higher mental processes, and yet 30 different 
forms are present in these 9 tests and no one form appears in them all. 
It would appear that there is no unanimity in the matter, and as
Professor Cyril Burt points out, in reference to Binet tests, there is “need
for determining by adequate modes of experiment and statistics the
utility of each individual test (form) as a measure of that function,
intelligence.” Generally the material of the tests, i.e., the words or
questions used, appears to be chosen in a somewhat haphazard manner,
without any attempt being made to include all categories; and no 
adequate experimental data are obtained for determining their 
suitability, efficiency and diagnostic value. 

It would appear however that Spearman’s principles of cognition 
are principles which might usefully be applied in selecting forms and 
material. An analysis of common forms of test, showing their value 
as determined by these principles was undertaken in 1922 by H. 
Perera, but his paper has not yet been published. Perera suggests
that a test should include all the various types of relations, attributive,
temporal, spatial, etc., both real and ideal, used with all different types
of fundaments, sensory, affective, etc., directly presented and
represented, pictorially, concretely and verbally. Whether tests
constructed in accordance with these principles would prove to be of
maximum value is a matter for experimental demonstration, but they 
certainly offer logical and systematic criteria for application to forms 
and material. 

Dr. C. R. McRae of Melbourne in a recent research made an a
priori analysis of the Stanford-Binet tests, using these principles, and
prophesied the discriminative value of each test as applied to physi- 
cally defective and mentally defective children. The coefficient of 
association between the value found practically and that prophesied 
was found to be .88. 

3. By the determination of the adequate length of test necessary 
to obtain a desired degree of precision in measurement. The efficiency 
of any measure of g is always impaired by the presence of specific 
factors, but the effect of these factors may be reduced in two ways: 
(a) by a careful selection of those forms and material which involve 
mainly the cognitive “noegenetic” processes, and (b) by the use of
tests of sufficient length to eliminate by mutual cancellation the 
effect of specific factors. It would seem that a shorter length of test 
would be sufficient with material of good quality, but that with material 
of a low value a very much lengthened testing would be required. 
g was defined above as that which would be measured by an infinity of 
independent test-forms, each of infinite length. As close an approxi- 
mation as possible is required to this exact measure. But the length 
of any test is a product of two factors, the number of test-forms, and 
the number of questions in each. 

Experimental research only can decide in which way a test is to be 
lengthened: by using a small number of lengthy forms, or a large number
of short forms. In practice there is considerable variation, as
exemplified in the tests discussed above. For example, the Binet tests 
are composed of a large number of forms each containing a small 
number of questions, usually not more than three; while on the other 
hand the Miller test comprises three forms, there being from 40-50 
questions in each. It is also necessary to determine the relative 
value of the different forms (probably for different groups of subjects) 
with different material, for it may be found that the material most 
suitable for use in one form is not suitable for use in another. 

4. It may be found that the number of forms with a high intellec- 
tive saturation is limited. If the number is small it will be necessary 
to include inferior forms, but to weight the scores in the better forms. 
Hitherto the value of weighting does not appear to have received much
consideration. In all tests the component forms have different values, 
but are nevertheless given equal weight in the combined score. In 
this way measures which are known to be relatively inexact are regarded 
as having the same value as the more exact ones, and thus the effi- 
ciency might be increased by weighting the scores in each component 
form by multiplying them by a factor involving the square of the intel- 
lective saturation of the form, before summing the scores in all forms. 
This procedure will not increase the value of a poor form, but will 
serve in some degree to minimize the effect of the factors conducive 
to poorness, in the total test score. 
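
A minimal sketch of such a weighting follows. The text specifies only “a factor involving the square of the intellective saturation”; taking the weight to be exactly the square of the saturation is an assumption made here for illustration, and the scores and saturations are hypothetical figures, not data from the tests discussed.

```python
def weighted_total(form_scores, saturations):
    """Sum the form scores after multiplying each by the square of that
    form's intellective saturation (assumed weight = r squared)."""
    if len(form_scores) != len(saturations):
        raise ValueError("need one saturation per form")
    return sum(score * r * r for score, r in zip(form_scores, saturations))

# Hypothetical subject scoring 40 on each of three forms of unequal quality.
scores = [40.0, 40.0, 40.0]
saturations = [0.9, 0.7, 0.5]

unweighted = sum(scores)                        # every form counts equally
weighted = weighted_total(scores, saturations)  # better forms count for more

# Share of the weighted total contributed by the poorest form.
poor_share = scores[2] * saturations[2] ** 2 / weighted
print(unweighted, round(weighted, 2), round(poor_share, 3))
```

Under equal weighting the poorest form contributes one-third of the total; under this assumed weighting its share falls to about one-sixth, which illustrates how the procedure minimizes, without removing, the effect of an inferior form.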

5. Though not usually included in the length of tests as discussed, 
the length of time devoted to fore-practice is a very important factor 
in determining the intellective saturation of tests. The matter has 
been investigated, and it is hoped to publish the results later, but here 
it may be said that the efficiency of test-forms and tests is increased 
by adequate fore-practice, though the proportion of time to be devoted 
to fore-practice requires fuller investigation. The beneficial effects 
of fore-practice have been stressed by both Spearman and Thorndike. 

Reference has been made so far mainly to verbal or linguistic
tests, but the underlying principles of test construction should be of
universal application. And apparently the intelligence measured by
performance tests is the same as that measured by verbal tests. Miss
F. Gaw, who has recently investigated this question, finds that “the
tests (performance) which correlate best with general intelligence,
as measured by the Binet scale, also correlate best with central
‘performance’ capacity. This seems to furnish strong corroborative
evidence that the central factor measured by performance tests is
largely general intelligence.”¹ The matter has not yet been definitely
proved, and if a difference between the two is found then it would be
necessary to include some non-verbal material in all tests of general
intelligence.

It has always been found that performance tests, or such as have
been used, have shown very little intellective saturation. It would
seem however that a low correlation with g is not inherent in these 
forms of test and this concrete material. The application of definite 
principles in their construction would, it is suggested, result in much 
more efficient tests. In particular it seems that their low value is 
due (a) partly to the very small number of questions in each form, 
and (b) to the lack of fore-practice in the methods, and use of con- 
crete material of a special type. 


SUMMARY 


1. General intelligence is defined, from the view-point of the 
mental tester, and the efficiency of tests as measures of it is considered. 

2. The efficiency (intellective saturation) of nine American tests 
is compared, and it is shown, by the application of the coefficient of 
alienation, that even a combination of the five best tests involves a 
possible 10 per cent error in measurement. 

3. Five methods are outlined by means of which measures of 
intelligence may be improved. 





1 Gaw, F.: A Study of Performance Tests. British Journal of Psychology,
1925, p. 387. 
AN EMPIRICAL VIEW OF INTELLIGENCE
RUDOLF PINTNER 


Teachers College, Columbia University 


During the present period of controversy over the meaning of 
intelligence and the interpretations of intelligence testing, it seems to 
the writer that the practical worker in intelligence testing is in need of 
an empirical view of intelligence that will help him to steer a safe 
course between the Scylla of philosophical discussion on the one 
hand, and the Charybdis of statistical interpretation on the other. 
The philosophical argument starts with the attempt to set up some 
theoretical definition of intelligence and then to force the actual tests 
themselves to fit into this definition, and the chances are that the tests 
will not fit. At any rate we are certain to be in conflict with other 
definitions of intelligence. The interpretation of our test results, on 
the other hand, leads some people to argue that intelligence is a 
fixed inherited characteristic of the individual, unalterable by environ- 
ment, but it leads others to a diametrically opposite conclusion. Let 
us, therefore, see whether we can approach our problem from another 
point of view and reach at least an empirical view of intelligence and 
intelligence testing, that will be of value to the psychologist working 
with intelligence tests. 

The most profitable approach seems to the writer to be indicated 
by the behaviorist standpoint in psychology. For our immediate 
purpose we are not concerned with behaviorism as a kind of psychology 
or philosophy. We are merely using a behavioristic approach for the 
purpose of arriving at a working philosophy for the mental tester. 

For our purpose, then, we shall consider behaviorism as dealing 
with the observable reactions of the organism. It is possible to imag- 
ine a classification of all human reactions from this point of view. 
Such a classification might be as follows: 

1. Gross muscular responses of the arms, legs, trunk, etc. 

2. Finer responses of the sense organs, eyes, ears, etc. 

3. Finer muscular responses of the larynx, fingers, etc. 

4. Internal or glandular responses, etc. 

Any such classification we might make is not crucial for our pres- 
ent discussion. We merely make it in order to observe that 
nowhere, in such a behavioristic survey of the responses of the animal, 
do we find any possibility of a group of intelligent or of unintelligent 
responses, just as we do not expect to find any group of good or bad,
moral or immoral responses. 

Forgetting now for the time being any of our preconceived and 
former notions of intelligence, we can make the proposal that any 
response of the organism may be used as an intelligence test under 
certain conditions. There is nothing distinctive about an intelligence 
test. The test merely uses a reaction or group of reactions, and any 
reactions which the individual makes may be used. Any set of reac- 
tions may become useful as intelligence tests when they are used to 
differentiate between individuals from a certain point of view. The 
fact that a certain response is designated more intelligent than another 
implies a judgment of value on the part of some other person. The 
judgment is made in the light of some criterion, and this criterion would 
seem to be in general whether the organism attains the end he is 
striving for, or to what degree he attains this end. Intelligence, 
therefore, from the empirical point of view signifies a judgment on the 
part of someone with reference to a specific response. 

The criterion as to whether the individual attains the end he is 
striving for is necessary for our judgments of intelligence. We can- 
not judge as to which is the more intelligent of two men unless both of 
them are striving for the same end. Obviously if one of them is read- 
ing a poem in order to get its meaning, and the other is reading the 
same poem in order to appreciate the rhythm alone, they are not 
striving for the same end, and we could give them no common examina- 
tion upon the poem by means of which we might judge the intelligence 
of the two individuals. From this is apparent the necessity for getting 
the right set in our intelligence tests, particularly in our group tests, 
for if a child sits in the group and does not try as hard as the others or 
does not try at all, it is impossible to get a true evaluation of his 
intelligence. 

Let us now examine further the idea that any reaction may under 
certain circumstances be used as a test of intelligence. Are the gross 
muscular responses of the individual so used? In early infancy the 
age of first starting to walk is significant. If attempts at walking are 
unduly delayed and if there are no physical reasons for such delay, we 
use these walking reactions as intelligence tests. We use them for 
the purpose of evaluating the intelligence of the individuals. And 
so also with many other gross muscular responses of the infant, such 
as climbing over an obstacle, throwing a ball, and so on. The assumptions
underlying our judgments of intelligence from such reactions are





610 The Journal of Educational Psychology 


that the individuals compared have roughly the same physical develop- 
ment and the same opportunities for gross muscular reaction. We do 
not think of using such reactions as tests in order to compare a crippled 
with a normal infant, or a child who has suffered long and serious ill- 
ness with a child of average health. Only when the background of the 
children is roughly the same, can such reactions be used as tests. The 
point which we are trying to make here, however, is that such reactions 
may be used as intelligence tests, and we are making this point in order 
to emphasize the idea that there are no specific kinds of reactions that 
are peculiarly intelligence reactions. 

In a similar way we might discuss at length the possibility of 
using the various sensory reactions as intelligence tests. Many such 
might be used at certain stages in the growth of the child. The finer 
muscular responses of the larynx and the fingers are frequently used. 
The age of beginning to talk is significant in infants. The ability to 
handle a pencil and draw things has been used as a test. Lifting cubes
with the hand and sensing the difference in weight is one of the Binet 
tests. More difficult to evaluate and, therefore, less frequently used 
are those general undefined reactions due to glandular disturbance in 
the organism. Emotional reactions are rarely used as indicators of 
intelligence, and yet they are at times so used. They also may be
intelligence tests. The intensity or amount of rage or joy shown by an
individual in a given situation may lead the onlooker to infer or judge
of his intelligence. “It is stupid to rage like that,” “It is foolish to
show so much anger,” and similar phrases are indicative of our use of
emotional reactions as intelligence tests.

It would be useless to proceed further with illustrations attempting 
to show how all reactions under certain circumstances might be used 
as means to judge the intelligence of individuals. It is enough for 
our purpose to point out this possibility in order to emphasize the fact
that there is nothing inherent in a reaction that makes it, rather than 
another reaction, a test of intelligence. We must get away from the 
idea of thinking that there is something particularly sacred to intelli- 
gence in naming opposites, or finding analogies or tying bow-knots or 
any similar reactions that have become the stock-in-trade of the intelli- 
gence tester. The important thing to note is under what circum- 
stances and for which individuals a specific reaction becomes a test 
of intelligence. 

This point of view clears up, I believe, the difficulty that has arisen 
as to whether certain reactions can or cannot be used as intelligence 
tests. Stern’s well-known definition of intelligence, as the ability to
adjust to relatively new situations, has made some contend that only 
novel stimuli may be used as intelligence tests. It has led to a differ- 
entiation between intelligence and knowledge, which is a useful 
distinction, but which may become harmful unless we realize just 
precisely what is meant by the two terms. Old stimuli, i.e., stimuli
that have occurred frequently in the past, may under certain circum- 
stances be just as good indicators of intelligence as novel stimuli, that 
is, stimuli that have never occurred before. We may judge intelligence 
from habitual modes of reaction just as well as from reactions to novel 
situations. The reaction to an old stimulus indicates how well the 
organism has learned to adjust. If we know that two individuals 
have been confronted with a given stimulus 20 times previously, we can 
compare their intelligence by means of their efficiency in reacting to this 
stimulus on the twenty-first occurrence. We sometimes use what a 
child has learned in school as an index of his intelligence, comparing 
him with other children who have had similar schooling. These old 
stimuli cannot be used so universally as indicators of intelligence as 
relatively new stimuli, because we cannot tell as accurately in the 
case of such stimuli, whether the children compared have the same 
common background. 

This brings us to the next important consideration with reference to 
the individuals to be compared or judged as to their intelligence. 
When we consider the circumstances which make a specific reaction a 
means for comparing the intelligence of two or more individuals, we 
find that there is an underlying assumption with reference to any 
reaction or group of reactions which we may use as measures of 
intelligence. We do not use age of walking as a test of intelligence in 
making a comparison between a physically normal child and a crippled 
child. The two children do not have the same common background 
with reference to this reaction of walking and, therefore, it becomes
useless as a means for evaluating or comparing the intelligence of the 
two children. Every test assumes a certain common background 
among the individuals tested. Or to put it another way, to use a 
standardized test as a test of intelligence, we must make sure in 
the first place that the individual being tested has roughly the 
same background as that of the children used in the standardization 
of the test. If not, the reaction or group of reactions in question 
cannot be used as a test of intelligence in this particular case. What- 
ever the stimuli may be, they must have roughly the same degree 
of familiarity or novelty to the individual to be tested, as they have to
the group by means of which they were standardized. Each intelligence
test assumes a certain background and unless an individual
possesses that background, he cannot be adequately judged by means 
of the intelligence test in question. The background of any standard- 
ized test is determined by the group used for standardization. The 
new individual to be tested must be like the standard group. The 
more like the group he is in this respect, the more accurately will his 
intelligence be estimated. The less like the group, the less adequately 
will his intelligence be measured. 

Let us examine for a moment some of the assumed backgrounds
of some well-known tests. The Stanford-Binet has been standard- 
ized upon American English-speaking school children. This test 
assumes the background of an American home and an American school. 
This is true of ages 5 or 6 to 14. The tests below age 5 assume an 
American home without nursery school experience. The more the 
child to be tested deviates from this background, the less accurate 
becomes the measure of intelligence. If the child is an American 
child without American schooling, the test is not so valid. This 
will be the case when the test is used on English children with English 
schooling, or on American children of foreign parents in a non-English- 
speaking home environment. And, of course, it becomes obviously 
useless for Chinese children in China, because the background in this 
case is so far removed from the background of the standardization 
group. Or to put it in other terms, there are no specific reactions which 
are always indicators of intelligence. Any reactions may be used, 
provided the background of the individuals to be compared is the same. 

The National Intelligence Test assumes roughly the same background
as the Stanford-Binet, i.e., American home and school experience.
And this is true of a great number of similar group tests.
The Terman Group Test assumes High School or at least Junior 
High School experience. The non-verbal type of test has arisen in 
order to broaden the background and include a wider group of indi- 
viduals for comparative purposes. The Pintner Non-Language Test 
assumes a more or less civilized life (as opposed to a savage or uncivil- 
ized existence) inasmuch as it takes for granted the ability to interpret 
pictures and the knowledge of Arabic numerals. It may, therefore, 
presumably be used for all American children, whether English-speak- 
ing or not, whether with or without school experience, and perhaps for 
most children in civilized communities. Similarly with the pantomime
form of the Myers Test. The Army Beta is another example of this
attempt to widen the common background and put more individuals 
upon a comparable basis. The new Princeton Universal Intelligence 
Test is the most decided attempt to make the background as broad as 
possible so as to include as many individuals as possible. It does not 
assume even the ability to handle a pencil or make marks on a paper. 
In a similar manner we could pass in review the several intelligence 
tests in use today and state the common background assumed for 
each one of them. The important thing to be noted is that the 
standardization group determines the background, so that the new 
individual to be tested must approximate the standardization group in 
background, if we are to use the specific reactions in question as 
intelligence tests. It has recently been found by Woolley that children 
who have been in nursery schools for a year make greater gains in IQ 
than those who have not attended nursery schools. These IQ’s are 
calculated on the Stanford-Binet Scale. We are, therefore, comparing 
three- and four-year-old children with a nursery school background 
with the standardization group of three- and four-year-olds who did 
not have such a background. Hence from our point of view the 
reactions called for by the Binet Test at these ages become poor means 
for comparing the intelligence of the two groups under consideration. 

We may now return once more to consider the value of old or new 
stimuli as tests of intelligence. It will now be obvious why old stimuli, 
fixed habits, or knowledge are less frequently of value as intelligence 
tests. A specific habit or bit of knowledge can only function as an 
intelligence test when the individuals to be compared have had equal 
opportunities of acquiring such habit or knowledge. Hence specific 
tests in school subjects, such as arithmetic, reading, spelling and the 
like, are very rarely used as intelligence tests, although theoretically 
they might be so used, were we assured of a common background among 
the children to be compared. The maker of an intelligence test seeks 
stimuli that are more novel and with reference to which the individuals 
to be tested are likely to be more nearly equal from the point of view 
of having had opportunities to react in that specific way before. Hence 
rather than selecting items to be added or multiplied, in regard to 
which previous experience may vary enormously, he selects a number 
series to be completed, assuming the common background of numbers 
going in series, but presenting relatively novel arrangements of number 
series which presumably have not been learned by the group to 
be tested. Of course, if they have been learned by some individuals 
and not by others, these items immediately become poor test items to
compare the groups in question. 

At this point it would be obvious and easy to object that we never do 
have individuals of the same background or that we never can be sure 
that individuals to be tested have had the same opportunities as the 
standardization group. Strictly speaking this is true. And this is
one of the many reasons why our intelligence tests are merely
approximations to exact measures. If we persisted in the strict logic of the
argument it would lead us to assert that no two individuals ever had
exactly identical environments and therefore no stimulus could ever 
be equally novel or equally familiar to both individuals. No two indi- 
viduals ever had identical backgrounds. In this way we might paralyze 
all attempt at intelligence measurement. What we have to do, there- 
fore, is to choose reactions for use as intelligence tests which are as far 
as possible equally novel or equally familiar to the individuals to be 
compared. We can improve our tests by keeping this principle in 
mind in the selection of items or in the selection of individuals to be 
rated by a given test. If we do this, it will prevent us from asserting 
dogmatically that an average IQ of 85 or 90 on the Binet really meas- 
ures the intelligence of Italian-speaking children in this country, or 
that gypsy children possess only the intelligence equivalent to an IQ of 
75 as found by Gordon. The point of view stressed in this article 
raises, with reference to such findings, the more fundamental question 
as to the adequacy of these comparisons in view of the wide discrepancy 
in background between the groups in question and the original stand- 
ardization groups. We must get away from the idea that certain 
stimuli constitute in and of themselves adequate intelligence tests 
on all occasions. We must free ourselves from the idea that there is a 
specific faculty of intelligence. We must remember that intelligence 
is merely an evaluation of the efficiency of a reaction or group of 
reactions under specific circumstances. 

Another device that the test maker uses to allow for the inequality 
of background among individuals is to select many different kinds of 
items. Granted that we can never be sure of the identity of the 
backgrounds of the individuals to be compared, we attempt to make 
comparisons among them by means of multiplying the number of 
reactions upon which we base our judgment of intelligence. If we 
select several different kinds of stimuli, the chances are that we will 
even up the discrepancies in background among the individuals in 
question, and so we find that a number of sub-tests combined into a 
test is a better instrument than a single type of material. The
Binet Test itself is a good example of such a miscellaneous collection 
of varied types of material, and we know from experience that it works 
well as a measure of intelligence. Of course this principle of hetero- 
geneity in test material does not mean that any mixture will be a good 
mixture. Each type of material selected must fit the background of 
the individuals upon whom it is to be used. 

It has become customary of late to distinguish between different 
kinds of intelligence, such as abstract, concrete, and social. Such 
distinctions are very useful, but I am afraid there is a tendency among 
intelligence testers to accept these as three faculties, as three distinct 
entities. So now our faculty of intelligence is split up into three
faculties: abstract, concrete and social intelligence. Here, again, the
important point to emphasize is that abstract intelligence is merely a 
measure of an individual’s efficiency in reacting to what have been 
classified as abstract stimuli, such as words and numbers; that con- 
crete intelligence is merely the efficiency in responding to concrete 
stimuli; and social intelligence his efficiency in reacting to people. 
This division is then merely on the basis of the stimuli which make up 
the tests. There is a decided practical value in having intelligence 
tests based upon different types of material. We must not let our- 
selves imagine that the abstract verbal type of test, such as the Army 
Alpha, is the only type of material by means of which we can measure 
intelligence. General intelligence in the broadest sense of the term 
will probably best be measured by a composite test including all 
types of reactions, abstract, concrete and social. 

This three-fold division into abstract, concrete and social intelligence
is not the only possible division by any means. Indeed, Bridges¹
has suggested another three-fold division into cognitive, affective and 
conative intelligence. By cognitive intelligence he means the capaci- 
ties to acquire ideas and ideational associations. Affective intelli- 
gence includes the capacities to condition, modify and combine the 
feelings and emotions. Conative or motor intelligence includes 
the capacities to coordinate motor responses into habits, to acquire 
technical skills. This classification of Bridges results from his theory 
of personality. The other three-fold division arises from the kinds of 
stimuli that can be used as test situations. 





1 Bridges, J. W.: A Theory of Personality. Journal of Abnormal and Social 
Psychology, Vol. XX, No. 4, Jan., 1926, pp. 362-370. 
CONCLUSION


I have tried in this article to approach the description of intelligence
from a new angle without claiming any new definition or theory
of intelligence. I have not attempted in any sense to explain what
intelligence is. I have merely given an empirical point of view which
I believe useful to the mental tester and the maker of tests. I believe
it useful inasmuch as it very definitely combats a common tendency
to regard intelligence as a specific faculty of the mind, in spite of our
modern abhorrence of faculty psychology. I believe it useful also
because it brings clearly to the mind of the mental tester the necessity
for deciding whether any individual to be examined can be adequately
examined by means of a particular test. He must always ask the
question, “Is this individual to be examined comparable to the
standardization group?” He must not apply the Binet Test to anyone
that comes along and imagine that he will always get a good measure 
of intelligence by so doing. I also believe it useful because it stresses 
the fact that no particular test item is always under all conditions a 
good stimulus for the measurement of intelligence. The type of 
material used by Binet has demonstrated its efficiency under many 
conditions and hence its great value. Intelligence is demonstrated in 
all our reactions and, therefore, many different types of reactions may 
become useful as measures. 








THE RELIABILITY OF FREYD’S INTEREST ANALYSIS 
BLANK*


RUTH M. HUBBARD 
University of Minnesota 


In the January-February 1926 issue of the Journal of Personnel 
Research, the writer¹ published results obtained by giving Max Freyd’s
Interest Questionnaire² to several different groups of freshmen at the
University of Minnesota. In that article, the writer called attention 
to the fact that no work had yet been done on the reliability or consist- 
ency of Freyd’s scoring method and that any future group or individual 
differentiations on the basis of the questionnaire should be preceded 
by a careful study of the reliability. The present paper attempts to 
supply, in part, such a study. 

The Interest Questionnaire consists of a list of 100 occupations 
and 100 or more other items such as sports, books, etc., each of which 
is followed by the letters L D O U. The subject is to cross out
L if he likes the item, D if he dislikes it, O if he is indifferent to it and 
U if he knows nothing about it. Freyd found the items which were 
crossed out a significantly larger percentage of times by mechanically 
inclined people, as represented by engineering students, and arbitrarily 
assigned to each a score of −1. To the items crossed out a significantly
larger percentage of times by socially inclined people, as represented 
by students in life insurance salesmanship, he assigned a score of +1. 
Then each person’s questionnaire was scored by obtaining the algebraic 
sum of his pluses and minuses. Thus a strongly minus score indicates 
an individual with mechanical interests while a strongly plus score 
indicates an individual with social interests. 

In the article, “Interests Studied Quantitatively,”¹ the writer
showed that Freyd’s scoring scheme so definitely differentiated contrasting
vocational groups that as small a minus score as −2 indicated
mechanical inclinations and that +2 or above indicated social inclinations,
leaving a neutral zone from −1 to +1. No claim has been made
that these scores show the relative strength of interests but simply the
direction thereof.
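The scoring rule described above can be sketched in a few lines. This is only an illustration: the items and weights in `SCORING_KEY` are hypothetical stand-ins (Freyd's actual key is not reproduced in this paper), and the function names are invented for the example. The cut-offs of −2 and +2 are the ones stated in the text.

```python
from typing import Dict, Tuple

# Hypothetical scoring key: (item, letter crossed out) -> weight.
# -1 marks a response typical of mechanically inclined people,
# +1 a response typical of socially inclined people.
SCORING_KEY: Dict[Tuple[str, str], int] = {
    ("machinist", "L"): -1,
    ("surveyor", "L"): -1,
    ("toolmaker", "L"): -1,
    ("salesman", "L"): +1,
    ("politician", "L"): +1,
}


def freyd_score(responses: Dict[str, str]) -> int:
    """Algebraic sum of the pluses and minuses earned by the responses."""
    return sum(SCORING_KEY.get((item, letter), 0)
               for item, letter in responses.items())


def direction(score: int) -> str:
    """Classify a score by direction only, using the cut-offs in the text."""
    if score <= -2:
        return "mechanical"
    if score >= 2:
        return "social"
    return "neutral"  # the zone from -1 to +1


example = {
    "machinist": "L", "surveyor": "L", "toolmaker": "L",
    "salesman": "D", "politician": "L", "golf": "O",
}
score = freyd_score(example)
print(score, direction(score))
```

Responses that are not in the key contribute nothing, so the score reflects only the keyed items, and the classification reports direction without any claim about strength.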

The Interest Questionnaire is given to all freshmen at the Univer- 
sity of Minnesota along with their intelligence test during registration 


* The writer wishes to acknowledge the assistance of Prof. Donald G. Pater- 
son of the University of Minnesota, under whose direction this work was done. 





618 The Journal of Educational Psychology 





week. For the purpose of studying the consistency or reliability of
the questionnaire it was later given to the classes in sophomore psychology
and to the classes in freshman rhetoric. So the data of this
paper include: first, original tests and retests a year later for 156 men
and 193 women who were freshmen in 1924; and second, original tests
and retests six weeks later for 285 men and 313 women who were freshmen
in 1925. All members of the psychology and English classes
who did not belong to the sophomore and freshman college classes
respectively, were eliminated.


TABLE I.—CENTRAL TENDENCIES AND VARIABILITIES OF GROUPS TESTED

                                  First test       Retest

193 Women                        (Sept., 1924)   (Oct., 1925)
  [row label illegible]              +2.28           +2.35
  [row label illegible]              +2.12           +2.37
  [row label illegible]               2.99            3.25
  [row label illegible]               2.49            2.57

313 Women                        (Sept., 1925)   (Nov., 1925)
  [row label illegible]              +2.41           +2.24
  [row label illegible]              +2.21           +2.25
  [row label illegible]               2.96            3.10
  [row label illegible]               2.35            2.49

156 Men                          (Sept., 1924)   (Oct., 1925)
  [row label illegible]              + .41           + .79
  [row label illegible]              + .45           + .80
  [row label illegible]               3.36            3.78
  [row label illegible]               2.73            2.98

285 Men                          (Sept., 1925)   (Nov., 1925)
  [row label illegible]              + .90           + .98
  [row label illegible]              + .79           + .62
  [row label illegible]               3.55            3.81
  [row label illegible]               2.79            3.08
Table I gives the central tendencies and variabilities of the groups.
The figures agree very well with those found for the groups used in the 
earlier work, which were for 114 science, literature and arts women, 
median +2.3, standard deviation 3.22; for 365 science, literature and 
arts men, median +.81, standard deviation 3.45. The central 
tendencies and variabilities are fairly constant for the two tests on 
each group and since they also agree, on the whole, with the earlier 
comparable figures, we may conclude that our samplings are adequate 
for obtaining typical reliability coefficients for liberal arts freshmen. 


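TABLE II.—RELIABILITY COEFFICIENTS

                          Number     Reliability    Difference   PE diff.   Diff. / PE diff.
                          of cases   coefficient
Women
  Year interval.......      193         +.49
  Six weeks interval..      313         +.47
Men
  Year interval.......      156         +.64
  Six weeks interval..      285         +.52

[The probable errors of the coefficients and the entries in the last
three columns are illegible in the source.]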








Table II gives the reliability coefficients for the groups.* The 
men show greater consistency of scores both for the shorter and for 
the longer periods. This may be due in part to the fact that the ques- 
tionnaire was standardized with male subjects and was originated for 
the purpose of differentiating men, with no thought of its use with 
women. 

An unexpected result shown in Table II is that the coefficients for
the year interval are, in both cases, higher than those for the six weeks
interval, although the differences are not statistically significant.
(To be statistically significant the ratio between the difference and
the probable error of the difference should be four or more.) A priori,
one would expect the shorter time interval to give a higher reliability
coefficient since the attitudes registered should change less in six weeks
than in a year. One suggested source of error in the six weeks coefficients
is that the questionnaires were given in English classes by the
English instructors themselves and that owing to the unusualness of
the situation and the lack of congruity between this material and the
usual class material, the instructors might have been unable to obtain
as serious and conscientious cooperation as was obtained in the psychology
classes. This source of error would not be a fault of the questionnaire
but would be due to the unfavorable conditions of its
administration. It follows that psychologists should extend the
technique of group mental testing to cover what appear to be “fool-proof
devices” such as this questionnaire might have seemed to be.
Perhaps we shall soon come to recognize that no psychological test,
however simple or self-explanatory, can safely be left to the layman to
administer.

* The reliability coefficient here used is an index of the extent to which the
Interest Questionnaire consistently yields similar scores for individuals when tested
at two different times. The Pearson Coefficient of Correlation [formula illegible
in the source] was used in computing the correlations reported in this paper.
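The significance criterion mentioned above (a difference at least four times its probable error) can be checked numerically. The sketch below is a reconstruction under stated assumptions, not the author's worksheet: it assumes the conventional probable error of a correlation, 0.6745(1 − r²)/√N, and the usual rule that the probable error of a difference between independent coefficients is the root of the sum of their squares. The coefficients and group sizes are those reported in the text and Summary.

```python
import math


def probable_error_r(r: float, n: int) -> float:
    """Conventional probable error of a Pearson r (assumed formula)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


def difference_ratio(r1: float, n1: int, r2: float, n2: int) -> float:
    """Difference between two coefficients divided by its probable error."""
    pe_diff = math.hypot(probable_error_r(r1, n1), probable_error_r(r2, n2))
    return abs(r1 - r2) / pe_diff


# Year interval versus six weeks interval, as reported in the paper.
men_ratio = difference_ratio(0.64, 156, 0.52, 285)
women_ratio = difference_ratio(0.49, 193, 0.47, 313)

for label, ratio in (("men", men_ratio), ("women", women_ratio)):
    verdict = "significant" if ratio >= 4 else "not significant"
    print(f"{label}: ratio = {ratio:.2f} ({verdict})")
```

Under these assumptions both ratios fall well short of four, which agrees with the statement that the differences between intervals are not statistically significant.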

Now what do the reliability coefficients themselves mean? The 
reliability of our best mental tests ranges between +.80 and +.90 
for heterogeneous groups. Similar reliability coefficients for homo- 
geneous groups (such as college students) stamp the test as a highly 
consistent measuring instrument. Here, the highest correlation even 
for men is only +.64. Does this mean that this method of scoring the 
Interest Questionnaire is too unreliable to have value or are there 
legitimate reasons for low correlations in this case? Immediately 
one thinks of reasons why these correlations must be low. First, the 
test is an attempt to register a supposedly fluctuating aspect of human 
behavior, i.e., attitudes. Presumably these attitudes of liking and
disliking various items are subject to violent changes. One might 
expect the liking or disliking reaction to any particular item to shift 
from time to time because of uncertainties present at the time of the 
response. In other words, it is conceivable that many of the subjects 
in reacting to any given item were uncertain as to whether they liked 
or disliked the item, in which case, the decision to cross out L, or D, or 
O would be greatly influenced by chance factors. Hence in reacting 
to the same item at a later date there would be little assurance that the 
same decision would be made because the chance factors would be 
likely to change. Theoretically such chance factors would play a 
much more important role in such decisions, based as they must be on
“feelings” more or less “vague,” than they would in conditioning
responses to the items of an intelligence test where definite “knowledge”
or ability rather than “feelings” are involved. Therefore a
correlation of +.50 or +.60 is higher than theoretical considerations 
would lead us to expect. Second, the groups used in this case were 




comparatively homogeneous; all the students used were registered in 
the colleges of science, literature and arts, education or business. In 
the earlier article,¹ the writer showed that the business students were
not separable from science, literature and arts students on the basis of 
mechanical and social inclinations. No work was done on students in 
education but one would expect no great differences between them and 
S. L. A. students. If groups made up of students from the colleges of 
engineering, mines, law and home economics as well as S. L. A. had
been used, higher correlations might reasonably have been expected 
since the groups would contain people with such definite mechanical 
and social interests that they had already reacted to them vocation- 
ally. As an example of this, some of the data from the earlier article 
may be combined. Of 827 cases 277 were engineers, 365 were S. L. A. 
men, 84 were pre-business students and 101 were law students. The 
median of this group is —.10 and the standard deviation is 3.99. 
Using the formula given by T. L. Kelley ($\sigma/\Sigma = \sqrt{1-R}/\sqrt{1-r}$) for discovering





TABLE III.—SHOWING CHANGES IN SCORE FROM ORIGINAL TEST TO RETEST,
AMONG MEN, IN RELATION TO THOSE IN THE EXTREME DIVISIONS

                                   Year interval    Six weeks interval
Total number of cases.........          156                285

                                   Number   Per     Number   Per
                                            cent             cent
Of those who were −2 or less...      24      56       40      58    remained −2 or less.
Of those who were −2 or less...      11      26       10      14    moved to between −1 and +1.
Of those who were −2 or less...       8      18       19      28    moved to +2 or more.
Total number −2 or less.......       43      ..       69      ..

Of those who were +2 or more..       41      67       77      63    remained +2 or more.
Of those who were +2 or more..       15      25       31      25    moved to between +1 and −1.
Of those who were +2 or more..        5       8       14      11    moved to −2 or less.
Total number +2 or more.......       61      ..      122      ..
the reliability coefficient of a second, similarly obtained group of cases
when its standard deviation is known, we obtain +.75 as the reliability
coefficient. Then in a more adequate sampling of all University
freshmen our obtained reliability coefficient of +.64 would rise to at
least +.75.
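The adjustment for range can be sketched as follows. Kelley's relation is commonly written σ/Σ = √(1 − R)/√(1 − r), where r and σ belong to the narrower group and R and Σ to the wider one; solving for R gives R = 1 − (1 − r)(σ/Σ)². The paper does not say which standard deviation was used for the narrower group, so the value 3.36 below (the first-test figure for the 156 men, as read from Table I) is an assumption made for illustration.

```python
def kelley_reliability(r_narrow: float, sd_narrow: float, sd_wide: float) -> float:
    """Reliability expected in a group of different variability.

    Solves sd_narrow / sd_wide = sqrt(1 - R) / sqrt(1 - r) for R.
    """
    return 1.0 - (1.0 - r_narrow) * (sd_narrow / sd_wide) ** 2


# r = .64 for men over the year interval; the wide-group SD of 3.99 comes
# from the combined 827 cases. sd_narrow = 3.36 is an assumed input.
estimate = kelley_reliability(r_narrow=0.64, sd_narrow=3.36, sd_wide=3.99)
print(f"estimated reliability in the wider group: {estimate:.3f}")
```

With these inputs the estimate lands close to the +.75 reported in the text; a somewhat different choice of standard deviation for the narrower group would move it by a few hundredths.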

Another and perhaps more significant device for studying reliability 
is that of comparing the changes made in individual scores from original 
test to retest. Tables III and IV give the data for such a comparison 
for the men and women separately. Among the men, by far the largest 
percentage who were in the two extreme divisions remained where 
they were. In all but one case, the smallest percentage moved to the 
division at the opposite extreme. An appreciable number in each 
case moved from near the borderline into the neutral division. Among 
the women, results were not so consistent. Large percentages moved 
from the lower division into neutral ground but few moved from the 
upper division into neutral ground. The women are typically more 
socially inclined than the men (cf. their central tendencies) so −2 is


TABLE IV.—SHOWING CHANGES IN SCORE FROM ORIGINAL TEST TO RETEST,
AMONG WOMEN, IN RELATION TO THOSE IN THE EXTREME DIVISIONS

                                   Year interval    Six weeks interval
Total number of cases.........          193                313

                                   Number   Per     Number   Per
                                            cent             cent
Of those who were −2 or less...       5      20       11      34    remained −2 or less.
Of those who were −2 or less...      13      52       15      47    moved to between −1 and +1.
Of those who were −2 or less...       7      28        6      19    moved to +2 or more.
Total number −2 or less.......       25      ..       32      ..

Of those who were +2 or more..       83      73      139      72    remained +2 or more.
Of those who were +2 or more..       25      22       45      23    moved to between +1 and −1.
Of those who were +2 or more..        5       4        9       5    moved to −2 or less.
Total number +2 or more.......      113      ..      193      ..
TABLE V.—SHOWING PERCENTAGES OF MEN AND WOMEN WHO CHANGED FROM
ONE EXTREME TO THE OTHER, IN RELATION TO TOTAL NUMBER OF CASES

                                            Men                       Women
                                     Year       Six weeks       Year       Six weeks
                                   interval     interval      interval     interval
Total number of cases                156          285           193          313

                                   Number Per   Number Per    Number Per   Number Per
                                          cent         cent          cent         cent
Had −2 or below in first test        43   27      69   24       25   13      32   10
Had −2 or below and changed
  to +2 or above in retest            8   4.8     19   6.6       7   3.5      6   1.5
Had +2 or above in first test        61   42     122   46      113   60     193   59
Had +2 or above and changed
  to −2 or below in retest            5   3       14   4.9       5   2.5      9   2.7





























relatively a more extreme mechanical score for them and thus one would 
expect less change in that part of the scale than higher up. In all 
cases a very small number moved from one extreme to the opposite 
extreme. So it appears that, considering the extreme divisions as 
outlined earlier, even the individual scores are fairly constant. Again, 
as pointed out earlier, a score is not intended to show relative strength 
of inclination but merely direction, and direction as evidenced by the
results of Tables III and IV seems to be quite constant, considering 
the expected variability of inclinations. 

Table V shows the percentages of the total numbers tested having 
scores in the two extreme divisions in the first test. It also gives the 
percentages out of the whole number in each group, who changed from 
one extreme to the other. In other words, it shows how great is the 
likelihood, in a rather homogeneous sampling of students, that extreme 
individual scores will radically change. Of these groups of from 150 to 
300 cases, no more than 7 per cent of any group moved from the
“mechanical” division to the “social,” and no more than 5 per cent
moved from the “social” to the “mechanical.” So although there
are enough changes in individual ranking in the group to bring correlations
down to +.50 and +.60, still there is relatively little change of
individuals from division to division. Thus the scores made are
fairly permanent and significant as indicators of direction of inclination.
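The two bounds quoted above can be recomputed from the counts in Tables III to V. The sketch below simply divides the number who crossed from one extreme to the other by the total number tested in each group; the dictionary layout and names are illustrative choices, and recomputed percentages need not match the printed entries to the last decimal.

```python
# Total cases per group, and counts of those who crossed between extremes.
totals = {"men, year": 156, "men, six weeks": 285,
          "women, year": 193, "women, six weeks": 313}
mechanical_to_social = {"men, year": 8, "men, six weeks": 19,
                        "women, year": 7, "women, six weeks": 6}
social_to_mechanical = {"men, year": 5, "men, six weeks": 14,
                        "women, year": 5, "women, six weeks": 9}


def percent_of_total(counts: dict) -> dict:
    """Express each count as a percentage of all cases in its group."""
    return {group: 100.0 * n / totals[group] for group, n in counts.items()}


to_social = percent_of_total(mechanical_to_social)
to_mechanical = percent_of_total(social_to_mechanical)

for group in totals:
    print(f"{group}: {to_social[group]:.1f}% mechanical to social, "
          f"{to_mechanical[group]:.1f}% social to mechanical")
```

The largest share in either direction belongs to the men retested after six weeks, and both maxima stay under the limits of 7 and 5 per cent stated in the text.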


SUMMARY 


1. The problem of the reliability of Freyd’s scoring method for 
his Interest Analysis Blank was attacked in two ways—by means of 
reliability coefficients and by studies of changes made in individual 
scores. 

2. The reliability coefficients were—for the men +.64 and +.52 
for year and six weeks intervals respectively; for the women +.49 
and +.47 for year and six weeks intervals respectively. 

3. The reliability coefficients for shorter time intervals were lower 
than for longer time intervals due possibly to more favorable conditions 
of testing for the latter.

4. Although the correlations appear low when compared with the 
reliability coefficients of intelligence tests, yet they are significant 
when considering such variable traits as inclinations and when obtained 
on such homogeneous groups as these. 

5. By comparing individual scores in original test and retest, a 
large percentage of those who scored in the extreme divisions of the 
scale at the first testing, were found to remain in the same divisions
at a later retesting even though the time interval was slightly more 
than a year. 

6. Very small percentages ranging from less than 2 per cent to 
7 per cent were found to move from the “social” division to the
“mechanical” or vice versa.

7. Therefore the Freyd scoring method of the Interest Question- 
naire is reliable enough to justify its further use in differentiation. 
Not only have group differences been found significant² but even individual
scores appear to be both significant and surprisingly permanent.

8. The question of its reliability cannot be considered settled till 
more work with even more cases has been done and till the results of 
this study have been confirmed by experiments on other groups, in 
other localities and using longer intervals of time. 
ON THE INADEQUACY OF THE PARTIAL AND
MULTIPLE CORRELATION TECHNIQUE 


BARBARA STODDARD BURKS 


Stanford University 
(Continued from the November Issue) 
PART II. IN DETERMINING COMMON AND UNIQUE FACTORS


Part I of the present paper dealt with certain limitations upon the 
use of the partial correlation coefficient in the study of causation, 
and showed wherein partial and multiple correlation techniques had 
been used in studies of causation far more widely than logic would 
warrant. 

Part II points out another kind of misuse to which the partial cor- 
relation coefficient has often been subjected. The discussion con- 
cerns the type of pitfall that besets the user of partial technique 
who would interpret his findings by such a conclusion as: “This partial
correlation coefficient between traits 1 and 2 represents the degree 
of association between 1 and 2 after all they hold in common with var- 
iable 3 has been eliminated.” Such an interpretation is usually only 
one out of an indefinite number of interpretations all consistent with 
the data at hand; and there is often no reason for selecting it in pref- 
erence to any of the others. 

The conclusions to be presented in this paper resulted from an 
attempt by the writer to reach an understanding of the mechanics 
underlying partial correlation. A simple experiment was devised 
in which the interrelations of the factors determining a correlation were 
postulated. The result was highly revealing and somewhat amazing 
as well, since the principles that were demonstrated are practically 
never recognized in the applications of the partial correlation tech- 
nique. The experiment is reproduced because it offers a good empiri- 
cal approach to the general considerations. 

Let three variables be intelligence, ability in German, and ability in French.
Let us designate them as X, Y and Z, respectively and record them as deviations
from their own means as x, y and z. Now let us postulate that these variables
are built up as the sums of component variables which are all independent of
one another (i.e., uncorrelated). Writing the component variables as well as
X, Y and Z as deviations from their means, we may let

$$x = a + b + c + d + e + f + g + h$$

$$y = a + b + c + d + i + j + k + l$$

$$z = a + b + c + d + m + n + o + p$$
Thus we see that x, y and z have factors a, b, c and d in common. The raw
correlations between the three variables can be found by the ordinary
product-moment formula,

$$r_{12} = \frac{\Sigma xy}{N \sigma_1 \sigma_2}$$

Substituting for $\Sigma xy$ in the numerator its value in terms of the component
factors of x and y, we have,

$$\Sigma xy = \Sigma (a+b+c+d+e+f+g+h)(a+b+c+d+i+j+k+l)$$

$$= \Sigma (a^2 + b^2 + c^2 + d^2 \text{ plus cross products equal to zero})$$

Cross products are equal to zero because the component variables are
uncorrelated.

Dividing by the N in the denominator, this reduces to,

$$\sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \sigma_d^2, \quad \text{since } \frac{\Sigma a^2}{N} = \sigma_a^2, \text{ etc.}$$

It remains to determine the values of $\sigma_1$ and $\sigma_2$. By an easily derived formula
involving the standard deviations of any number of uncorrelated factors entering
into a given variable,

$$\sigma_1^2 = \sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \sigma_d^2 + \sigma_e^2 + \sigma_f^2 + \sigma_g^2 + \sigma_h^2$$

and

$$\sigma_2^2 = \sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \sigma_d^2 + \sigma_i^2 + \sigma_j^2 + \sigma_k^2 + \sigma_l^2$$

To simplify computation, let us set the standard deviations of the component
factors A, B, C . . . P equal to one another, and call them all equal to $\sigma$. This
simplification will not interfere with our generalization.

Then

$$r_{12} = r_{13} = r_{23} = \frac{4\sigma^2}{8\sigma^2} = \frac{4}{8} = .50$$

Now since y and z contain no factors common to each other which are not
also contained in x, the correlation between y and z after the factors common to
x have been eliminated is zero.

But applying the ordinary partial correlation formula to our intercorrelations
between X, Y and Z,

$$r_{23.1} = \frac{r_{23} - r_{12} r_{13}}{\sqrt{(1 - r_{12}^2)(1 - r_{13}^2)}} = \frac{.50 - .25}{1 - .25} = .33$$

We might conclude from this result that German and French held a definite
proportion of factors in common which were independent of intelligence, although
by hypothesis this is not true.
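The hypothetical experiment can be retraced numerically. The sketch below assumes, as the text does, that every component factor is uncorrelated with the others and has the same standard deviation, so that the correlation between two traits is the number of shared factors divided by the geometric mean of their factor counts. Representing each trait as a set of factor labels, and the function names, are choices made for this illustration.

```python
import math


def correlation(trait_1: set, trait_2: set) -> float:
    """Correlation of two sums of equal-variance, uncorrelated factors."""
    shared = len(trait_1 & trait_2)
    return shared / math.sqrt(len(trait_1) * len(trait_2))


def partial_correlation(r_23: float, r_12: float, r_13: float) -> float:
    """First-order partial r_23.1: traits 2 and 3 with trait 1 held constant."""
    return (r_23 - r_12 * r_13) / math.sqrt((1 - r_12 ** 2) * (1 - r_13 ** 2))


x = set("abcdefgh")  # intelligence
y = set("abcdijkl")  # ability in German
z = set("abcdmnop")  # ability in French

r_xy, r_xz, r_yz = correlation(x, y), correlation(x, z), correlation(y, z)
r_yz_given_x = partial_correlation(r_yz, r_xy, r_xz)

# What remains of y and z once the factors shared with x are deleted.
r_after_deletion = correlation(y - x, z - x)

print(r_xy, r_xz, r_yz)        # the three raw correlations
print(round(r_yz_given_x, 2))  # the partial with X held constant
print(r_after_deletion)        # the correlation after deleting common factors
```

The contrast between the last two printed values is the point of the example: deleting the common factors leaves no correlation, while the partial correlation formula leaves a third.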


The question immediately comes up as to how the notable difference
arises. Many of us have believed that in applying the formula to
correlation data we were accomplishing statistically what we
accomplished mechanically by deleting the common factors. The
explanation is simple. Holding a complex trait like X constant is not
the same thing as holding constant (or deleting) in Y and Z all that is
contained in common with X. For the constant trait is in some individuals
due to large amounts of the factors common to Y and Z (i.e.,
A, B, C and D) and correspondingly small amounts of independent
factors; in other individuals it is due to small amounts of the factors
common to Y and Z and correspondingly large amounts of the independent
factors. Thus, even on a constant level of trait X, the
combination of factors A, B, C and D would still have variability, and
a residual correlation between traits Y and Z would be a necessity.

A little further analysis assures us that including the type of 
relationship between variables postulated in the example above, there 
are at least four main types of pattern which can represent the inter- 
correlations entering a partial correlation. The number of possible 
elaborations and combinations of these is indefinitely large; and in 
only one of the types are we justified in interpreting our partials as if 
everything contained in common with the variable held constant has 
been eliminated. The types are: 


(1) Trait 1 = a + b
    Trait 2 = a + c
    Trait 3 = a + d


Here a represents a common factor (or group of factors which 
behave as a unit as the group a, b, c and d in the hypothetical example 
of intelligence, French and German). 


(2) Trait 1 = a + b
    Trait 2 = b + c
    Trait 3 = c + d


Here there is no common factor running through all three traits. 
We have the curious situation of two traits giving zero correlation with 
each other, but each correlating decidedly with a third trait. 


(3) Trait 1 = a + b + c
    Trait 2 = a
    Trait 3 = b


In this type, which is really a special case of type 2, spurious nega- 
tive correlations may arise between variables having little or nothing 
in common when another variable to which they both contribute is 
partialled out. If, for example, a group mental test is devised which is 
considerably affected by speed of handwriting as well as by intelligence, 
and if performance on this group test is partialled out from the negli- 
gible correlation between an individual test like the Binet and speed of 
handwriting, the correlation between Binet and speed of writing 
becomes negative. This is because to attain a given level on the
group test, the pupils who can write fast need less intelligence, and the 
pupils who have more intelligence need less speed of writing. In this 
case we have postulated certain relationships between the traits 
but when we actually obtain such correlations there is no way of telling 
whether that hypothesis fits, or whether speed of writing really does 
contain factors that are in antipathy to Binet intelligence. 


(4) Trait 1 = a
    Trait 2 = a + c
    Trait 3 = a + d


Here trait 1 consists of just one factor (or group of factors behaving 
as a unit with respect to traits 2 and 3). This is the one type of situa- 
tion in which the partial correlation actually frees variables 2 and 3 
from all they contain in common with variable 1. 
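The four patterns can be compared by computing, for each, the partial correlation between traits 2 and 3 with trait 1 held constant. As a simplifying assumption for illustration, every factor is taken to be uncorrelated with the others and of equal variance, so a raw correlation is the number of shared factors over the geometric mean of the factor counts. The set representation and function names are hypothetical conveniences.

```python
import math


def correlation(trait_1: set, trait_2: set) -> float:
    """Correlation of two sums of equal-variance, uncorrelated factors."""
    return len(trait_1 & trait_2) / math.sqrt(len(trait_1) * len(trait_2))


def partial_23_given_1(t1: set, t2: set, t3: set) -> float:
    """Partial correlation of traits 2 and 3 with trait 1 held constant."""
    r12, r13, r23 = correlation(t1, t2), correlation(t1, t3), correlation(t2, t3)
    return (r23 - r12 * r13) / math.sqrt((1 - r12 ** 2) * (1 - r13 ** 2))


patterns = {
    1: (set("ab"), set("ac"), set("ad")),   # one factor common to all three
    2: (set("ab"), set("bc"), set("cd")),   # no factor common to all three
    3: (set("abc"), set("a"), set("b")),    # traits 2 and 3 both feed trait 1
    4: (set("a"), set("ac"), set("ad")),    # trait 1 is the common factor alone
}

partials = {k: partial_23_given_1(*traits) for k, traits in patterns.items()}
for k, value in partials.items():
    print(f"type {k}: r23.1 = {value:+.3f}")
```

Only type 4 yields a partial of zero. Type 1 leaves a positive residual although traits 2 and 3 share nothing beyond the common factor, and type 3 produces a negative partial between traits that have no factor in common at all.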

Just a few examples from the literature should provide ample illus- 
trative material. 

Gates and LaSalle¹ furnish an instance wherein material quite
probably falling into type 1 or 2 has been manipulated as though it 
belonged in type 4. These writers attempt to test Spearman’s two- 
factor theory with data upon the intercorrelations between a number 
of achievement tests in various school subjects. If the correlations 
between these tests are due to a general factor contributing to all of 
them, then, write Gates and LaSalle, “it should follow that by eliminating
from the correlation between 1 and 2 the association due to a
third function which gives about equal correlation with them, the 
results would be zero, or approximately zero correlation. By means of 
partial correlation it is possible to make such eliminations. . . ” 
Carrying out the suggested procedure, such typical partial correlations 
are found as .40 between arithmetic and spelling with reading rate 
rendered constant. While the authors cautiously admit that the func- 
tions tested are too few and the factors of various sorts are too complex 
to make them more than suggestive, they conclude that “there is 
apparently no single common factor running through all of these tests 
which accounts in a complete way for the relations among them.” It 
is clear, however, that with data in which none of the variables could be 
viewed as representing the uncontaminated hypothetical general 





1 Gates, A. I. and LaSalle, Jessie: The Relative Predictive Value of Certain 
Intelligence and Educational Tests. Journal of Educational Psychology, Vol. 
XIV, 1923.
factor, and in which the variables have been selected as having approximately
equal intercorrelations, partials of zero magnitude would be
not only unexpected but impossible. An analysis of them by partial 
correlation can neither establish nor disprove the Spearman theory. 

Passing next to a study by Jennie B. Wyman of the influence of 
interest on school achievement,¹ the following quotation is pertinent:

“The correlation between intelligence and achievement total when 
the effect of intellectual interest is eliminated is .753. The correla- 
tion between intellectual interest and the achievement total when the 
effect of intelligence is eliminated is .490. These correlations give a 
comparative measure of the effect of these two functions on success as 
measured by achievement in school subjects.” (The raw values of
the correlations were .817 and .629 respectively.)

In the first place, the generalizations formulated in the discussion 
of causation in Part I of the present paper apply here. The status of 
intellectual interest as a possible cause or effect or interacting force is so 
uncertain as to lend a good deal of ambiguity to any partial correla- 
tion coefficient utilizing it. In the second place, if by “effect” is 
meant merely the degree of distinctive association between achievement 
and each of the two other traits, it is impossible that both the correla- 
tions quoted should represent this. While all the factors contained 
in both intelligence and interest could conceivably enter into achieve- 
ment, all the factors in interest could not enter intelligence and at the 
same time all the factors in intelligence enter interest unless the corre- 
lation between these two traits were unity. Thus not more than one 
of the two partials could be of type 4, and we have no way of knowing 
that even one of them is. 

Finally it will be of interest to cite Cole’s study² in which the partial
correlation between amount of Latin studied in High School and first 
year grades in college Spanish with intelligence constant is presented. 
Cole interprets his findings as indicating that a relationship of .24 
exists between amount of Latin taken at High School and Spanish 
performance independent of what either owes to intelligence. This 
may be the case, but the data do not demonstrate it, since it is quite 
possible that the measure of intelligence contains among a large number 
of factors some that are specialized in the direction of language ability 





1 Wyman, Jennie B.: Tests of Intellectual, Social and Activity Interests.
In Genetic Studies of Genius, Vol. I, 1925.

2 Cole, L. E.: Latin as a Preparation for French and Spanish. School and
Society, Vol. XIX, 1919, pp. 618-22.
and some which are not. In case it does, Latin and Spanish would
have a residual correlation after intelligence was partialled out even
though all the factors contained in both were contained in intelligence.


CONCLUSIONS 


1. There appear to be four main types of pattern which can repre- 
sent the inter-relations of variables entering a partial correlation. 

2. Of these, the only type in which two variables are actually free 
from all they contain in common with a third variable rendered con- 
stant is that in which the third variable contains just one factor (or 
group of factors behaving as a unit with respect to the first two 
variables). 

3. The partial correlation technique has frequently been used in 
research where only controlled experimentation would yield valid 
results; for to justify its use in most situations of the type described 
would require a knowledge of the inter-relations of variables which it is 
the purpose of the partial correlation coefficient itself to uncover. 
THE GESTALT THEORY IN EDUCATIONAL
PSYCHOLOGY 


ARTHUR I. GATES 


Teachers College, Columbia University 


A review of The Growth of the Mind by Kurt Koffka, New York; 
Harcourt, Brace & Co., 1924, pp. XVI +375. Translated by R. M. 
Ogden and Psychology and Education, by R. M. Ogden, New York; 
Harcourt, Brace & Co., 1926, pp. XII + 364. 


That the Gestalt psychology may have new and definite signifi- 
cance when applied to the problems of educational psychology is 
indicated by the choice of this field for exposition of the new theories by 
an outstanding student, Koffka, by the nature of the research work of 
another eminent student, Koehler, and by the acceptance of the
doctrines by an American specialist in educational psychology, Ogden. 
Koffka’s Growth of the Mind, a treatise on educational psychology, was 
the first systematic account of the Gestalt theory to be translated into 
English and now the translator, Ogden, has produced another text
based upon the same general doctrine. The time has arrived, mani- 
festly, when serious students of educational psychology should care- 
fully appraise the new explanatory conceptions which, having arisen 
rapidly in Germany within recent times, have been spreading with even 
greater rapidity abroad. 

In both of the books before us, two general topics, the character- 
istics of instinctive activity and the nature of processes of learning are 
treated in detail as a means of illustrating and applying the Gestalt 
principles. 

Both Koffka and Ogden accept the instinct hypothesis, at least in 
the sense of affirming “complex adjustive behavior that does not 
require to be first learned before it can be employed” (Ogden). Both 
attack vigorously, Koffka at much greater length, the Spencerian 
doctrine of instinct conceived as a “chain of reflexes.” Ogden spe- 
cially stresses the belief that an instinct is an indivisible unity preceding, 
biologically, the more definite, fixed “touch and go” type of act, the 
reflex. This writer conceives that there are two original types of 
persistence activities, attraction and avoidance. Later were evolved 
more definite types of activities, characterized primarily by what 
Lloyd Morgan termed “persistency with varied effort,” and taking 
the form either of attraction or avoidance. These activities are the 
instincts. Still later, more definite and fixed reactions, characterized 
by little or no “varied effort” appeared. These are the reflexes. 

An instinct is conceived by these writers as a unified activity, a 
“continuous movement,” a temporal “configuration,” explicable only 
in terms of “unclosed” and “closed” physiological or mental systems. 
The instinct “does not appear as a multiplicity of separate movements 
but as one articulate whole embracing an end as well as a beginning.” 
Instinctive activity set off by a stimulus persists with varied effort by 
virtue of a type of tendency until the goal or “end situation” is reached. 
The essence of the persisting tendency is to be found in the Gestalt 
principle of “closure.” The instinctive activity in the transitional 
stages of varied effort is an “unclosed” system or Gestalt character- 
ized by persistent action until the system or figure is “closed.” When 
we ask for details concerning the nature of “closure,” we are told that 
a “concrete example will best explain what we mean. A soap-film is 
produced upon a wire-frame and upon it a little noose of thread is 
cast in whatever form it may take. If one proceeds carefully the 
thread will be supported upon the surface of the film, but if one pricks 
the film inside the noose, the surface will break apart and the thread 
will be pulled out by the surface-tension of the outer portion of the 
film which seeks to give the area outside the thread the least possible 
surface, and the area circumscribed by the thread the greatest possible 
surface. As a result the thread immediately assumes the form of a 
circle. In this example we can conceive of circularity as the ‘end situa- 
tion,’ puncturing the soap-film as the stimulus releasing the movement, 
and the movement itself as the ‘transitional situation.’” Instinctive 
activities are of this type. The source of the persistent activity once 
the instinct is released is the unclosed system which somehow strives 
toward closure into a simple configuration. “An explanation of 
instinctive behavior is, therefore, not called upon to discover an 
inherited system of connected neurones, but rather to investigate what 
kinds of physico-chemical ‘closure’ produce these astonishing types 
of behavior and under what conditions.” 

This new school of psychology assumes that behavior begins not 
with a chaotic group of meaningless sensations or number of uncor- 
related motor reactions but, on the contrary, with unitary, articulate, 
meaningful wholes to which are applied the terms Gestalt, configura- 
tion or structure. Just as motor acquisitions begin with integrated 
patterns of response, the instincts, so perceptual learning begins with 
unified, meaningful mental structures. Learning consists in 
transformations of these configurations, mental and motor. The 
various directions which these metamorphoses may take or the 
several results they may produce are described (by Ogden) under 
such terms as particularization, differentiation, assimilation, gradation, 
and the like. Whatever the direction or result, learning proceeds 
gradually by transformations, and, as if “by a number of leaps 
and bounds” (Koffka) different motor configurations, perceptive 
patterns, meanings, insights are achieved. Progress is the result not 
of a gradual process of addition and elimination of “bonds” or 
reactions, not of gradual acquisition by trial and accidental success, 
but of a series of metamorphoses. 

When, now, we ask how these transformations are brought about, 
we are again introduced to the principle of closure. Much as the 
“unclosed” system sustains instinctive activity until the configuration 
is “closed,” so learning is activated and perpetuated by an “incom- 
plete,” “unstable,” “unclosed” condition or system. Fundamental 
to learning is the disposition of such systems in the human organism— 
as in other phases of nature—to “become stabilized,” “succeed in 
filling a gap,” “become as perfect as the prevailing conditions admit,” 
“achieve a simple state.” These are expressions of the dynamic 
aspects of the “universal law of Gestalt.” 

Reflexes, instincts, percepts, memories, insights, emotions and all 
native or acquired motor and mental acts are thus conceived as unified, 
indivisible gestalten in space or time. But this is not the whole story. 
Situations and events in the physical world are also configurative, that 
is, so unified as to be unaccountable as a sum of parts. Conscious 
phenomena, consequently, are not explained as due to the arousal of 
separate areas of the cortex by means of neural connections since the 
brain like other physical organizations is not a collection of parts or 
activities but a Gestalt. Brain action, in other words, is configurative 
action. Each of these statements, taken alone, is intelligible, but the 
alleged relations or parallelism between the several Gestalten are not. 
The human response to the external configuration is somehow pro- 
duced by a pre-existing parallelism or harmony between two (or more) 
configurations. The argument is of two types: Either the mental 
configuration is somehow impelled to “close” into the simple configura- 
tion existing in its counterpart, the Gestalt in the environment, or the 
external figure, the nervous system and mental acts are all parts of 
one all inclusive configuration which tends to “close” as a whole. In 
either case the human being is subject to the universal pressure to 
“fill the gap” and thus bring the Gestalt to perfection. “If the system 
can yield to this pressure, then the result is achieved” (Koffka). If 
this account of behavior strikes us at first as highly fanciful or even as 
invoking mysterious potencies and spirits, it should nevertheless be 
studied in its detailed application to provide a greater and perhaps 
a more promising insight. Space is available here for examination of 
a few detailed accounts. 

One issue given great attention by all Gestalt writers concerns 
insight itself. The general theory of trial and error learning and 
especially Thorndike’s descriptions of animal achievements are the 
subjects of sharp and extended criticism. It is contended largely on 
the basis of Koehler’s extensive investigations that the principle of 
trial and error fails utterly to account for much of the learning of ani- 
mals and man and that the prevalence of sudden and “intelligent” 
acquisitions during the process impels us to accept “insight” as a 
fundamental feature of learning. While an adequate review of this 
discussion should include both a presentation of important distinc- 
tions between the principle of trial and error as explained for criticism 
by Koffka and the principle as conceived by others and a review of the 
limitations of Koehler’s studies, space is here available only for a few 
brief comments on the assumption of insight. Insight is an instance of 
a sudden mental transformation, a sudden success in “filling a gap” in 
a mental configuration, a sudden achievement in “yielding to a pres- 
sure.” “We call the organism that succeeds in filling the gap clever”; 
it displays insight. 

However true or reasonable such statements may be, they are 
incomplete explanations. The achievement of the learner is named— 
“insight”—but it is not explained; it is not even described except in a 
most vague way. With the principles of situation and reaction, varied 
activity (trials) made possible by constantly shifting positions and 
inner conditions of the active learner, response by analogy, accidental 
success both in movement, perception and mental activities during 
recall, Thorndike at least observes the principle of parsimony and 
affords an explanation of learning which may be applied to and there- 
fore tested by the observed occurrences in solving mechanical puzzles, 
following a maze, learning the abc’s, acquiring conceptions of abstract 
facts such as triangle or heat, seeing into a problem of geometry and 
the like. The Gestalt conception of insight solves the problem only 
by placing a verbal tag on this mass of most subtle activity. 

In most of their writings, exponents of Gestalt psychology deliver 
heavy blows at the traditional “mind dust” theory of sensationism 
and at the association theory of learning. In the first instance, the 
blows mainly strike thin air since scarcely more than a ghost of the 
objects of attack remain among us. In the second conflict, worthy 
opponents aplenty are to be found, though perhaps few will profess 
the associationism assaulted. In this attack the Gestalt exponents 
assume, or seem to assume, that the conception of association, in any 
form, is incompatible with the unified, integrated, configurative 
response. Since reform is often more fruitful than destruction, it will 
doubtless be worth while for students to consider whether the inte- 
grated types of response may not be explained in the familiar terms of 
stimulus and response. 

The stimulus-response or reaction hypothesis as conceived particu- 
larly by Thorndike, Woodworth and others (usually considered a 
form of association psychology) seems to have no difficulty in account- 
ing for a complex, unitary response to a plurality of stimuli. In 
treating illusions, perception of the ambiguous figures (visual assem- 
blages of dots, lines, etc., that take on different appearances from time 
to time) and other stimuli, Woodworth very clearly indicates the fact 
that “several stimuli, acting together, arouse a unitary response.” 
The varying perceptive patterns to the same external situation are 
explained indeed by enlarging the aggregate of stimuli to include intra- 
organic factors such as muscular and neural conditions of intelligible 
sorts. That figures, relations, chords and the like are unitary percep- 
tual responses has long been affirmed by these writers and others and 
the explanations offered are as yet neither obviously quite inadequate 
nor, perhaps, fundamentally incompatible with the essence of Gestalt 
psychology. At any rate, the rich body of experimental and reflective 
data produced by the Gestalt psychologists affords the materials for 
a new test of the explanatory principles familiar to American students 
and, at the same time, the treatment of integrated and unitary reac- 
tions by the stimulus-response groups merits the careful study of the 
exponent of Gestalt. 

In the present formulations, the Gestalt conceptions leave much 
to be desired in definiteness and precision. They seem still to be lack- 
ing in exact implications. If accepted it is not at all apparent what 
corollaries these formulations would yield concerning the control of 
the developing child in typical situations, methods of teaching the 
alphabet, arithmetic, manners, or desirable reforms of the school 
curriculum. The books under review are strikingly deficient in appli- 
cations of the Gestalt doctrine to practical problems in education and 
of those offered (more by Ogden than by Koffka) few, if any, seem 
necessary and unique deductions from the main theories or unlike 
those already defended as applications of familiar forms of association 
or stimulus-response psychology such as that of Thorndike. The 
Gestalt formulations as yet strike one as very general characterizations 
of behavior rather than exact statements of a complete series of prin- 
ciples adequate for specific guidance of scientific generalizations. 
It should be appreciated, however, that the exponents of Gestalt have 
not as yet had ample time to work out all the implications of their 
theory. Perhaps in time, greater definition will be achieved. 

Whether or not the Gestalt theories will prove to be fully satis- 
factory principles, they will doubtless exert in time an influence upon 
current psychological interpretation. Though not wholly new—the 
notion of unified structures is found in the teachings of James, the 
“creative synthesis” of Wundt, the “objects of a higher order” of 
Witasek, the perception of verbal units in reading in the investigations 
of Cattell, the treatment of perception and learning by Woodworth, 
etc.—the Gestalt conceptions of the configurative character of mental 
reaction will give a new emphasis to the physiological interpretation 
from which much good may result. Even if the new teachings do 
not revolutionize psychology they may, at least, reform it considerably. 

Since but little of the space allotted for this review remains, it will 
be devoted to a brief characterization of the two books. 

The volume by Koffka is written with extraordinary clarity, an 
achievement for which the translator, Ogden, doubtless deserves great 
credit. This book deals almost exclusively with criticism of rival views 
and exposition of the Gestalt hypotheses. The near absence of the 
educational implications of the latter theories may be a disappoint- 
ment to many readers. The extensive criticisms of familiar theories 
of instinct, learning, intelligence, insight, memory, perception, associa- 
tion, attention, etc., are exceedingly suggestive materials for the 
appraising student. There are, in fact, many who feel that these 
criticisms exceed the author’s positive contributions in acuteness and 
in promise for reform of current thought. The book also contains 
accounts of a large number of highly important researches that will 
be unfamiliar to students who have not kept abreast of research in 
Germany during the last decade. 

Ogden’s book is both a companion and a supplement to that of 
Koffka. It devotes very little space to criticism or to accounts of 
technical studies. Though by no means lacking in original suggestions, 
it attempts primarily to explain the Gestalt theories and to suggest 
their educational implications. It is a more elementary and briefer 
book than Koffka’s and contains much more material, such as classifica- 
tion of particular instincts, of the type found in most current texts 
in educational psychology. Though written with obvious care for 
exactness of expression and illustration, it is by no means light reading; 
it is a book for serious study. 

That the two best systematic accounts of Gestalt psychology in 
English should appear on the shelf for educational psychology will be, 
it is hoped, sufficient incentive to secure widespread and careful con- 
sideration by students of education of these new and promising doc- 
trines. Until abundant time for reflection, discussion and further 
experiment has been provided, any confident appraisal of these new 
theories would be premature. That such able men as Koffka, Koehler, 
Wertheimer, and Ogden are devoting themselves to tests of the Gestalt 
principles is the source of great satisfaction. 


WHAT WE KNOW ABOUT SUPERIOR CHILDREN 


Gifted Children: Their Nature and Nurture, by Leta S. Hollingworth. 
New York: The Macmillan Company, 1926. Pp. XXIV + 374. 


This very pleasing book tells in a compact and readable way what 
is known today about gifted children. Professor Hollingworth 
arbitrarily defines a gifted child as one having an IQ of 130 or over, 
which includes approximately the best 1 per cent of children. With 
true scientific temper the author reports only those facts and relation- 
ships that have been verified by experimental method and objective 
measurement, supplemented here and there with valuable comments 
resulting from her own observation of gifted children. This book is a 
very satisfactory résumé of the work done on gifted children in recent 
years and its reading should provide an antidote to the oftentimes 
maudlin overemphasis on the needs of unfortunate deviates. 

The topic is approached historically. A chapter is devoted to 
stating the number and distribution of gifted children. Following 
this the physique, character, temperament, and interests of gifted 
children are described. A chapter is devoted to the description of 
development of gifted children. The specialization of ability is dis- 
cussed in another chapter. One of the most interesting chapters, 
and one which describes gifted children most vividly is that which gives 
case studies of certain children with exceptionally high IQ’s. The last 
chapters have to do with the education of the gifted and a general 
rationale of the topic. 

An interesting feature of the book is the photographs which show 
the superiority of the gifted in height to normal children of the same 
age. These photographs leave vivid impressions that no table or 
graph can do. To the uncritical, however, these photographs may leave 
the false impression that almost all gifted children are above average 
height as the particular gifted children who have been photographed 
are all taller, with one exception, than the average child of the same age. 

In a book of this kind various things may interest different readers. 
I was particularly interested in the discussion of the advantages and 
disadvantages of rapid progress and of segregation. Professor Holling- 
worth presents arguments in favor of segregation and then shows that 
with a segregated group who are self-sufficient socially, rapid progress 
is the more defensible. 

Some may be interested in Professor Hollingworth’s reasons why 
gifted women tend to avoid child-bearing. To label as selfish such 
arguments as “For a gifted woman to have children means that she 
must incur pain, a certain statistically determined risk of health or of 
life, injurious restriction of activity, and inhibition of the strong self- 
assertive drives which are called personal ambition” may not be very 
convincing. But if arguments must be based on motives which appeal 
strongly to men (as opposed to women), it would seem to me that for 
an intelligent woman no motive would be as strong as that of applying 
all of the skill, the care, the artistry, and scientific knowledge one 
possesses to the upbringing of sons and daughters who would be recog- 
nized as worthy products of a mother’s care. This task, to be well 
done, makes demands on technical knowledge and skill and yields a 
product such as is found in few activities open to men. I am sure no 
intelligent man would consider the avoidance of pain, the avoidance of 
loss of health, the freedom of activity, or the gratification of modern 
forms of competitive ambition to be more desirable than the satisfaction 
that comes from painting a great picture, writing a great book, building 
a bridge, or discovering the cause of disease. 


PERCIVAL M. SYMONDS. 
Teachers College, Columbia Univ. 





FREEMAN SURVEYS THE FIELD OF MENTAL TESTS 


Mental Tests, by Frank N. Freeman. Chicago: Houghton Mifflin Co., 
1926. Pp. IX + 503. $2.40. 


As Freeman mentions in his preface, though mental tests have been 
in use for the past 70 years, there has not yet appeared any book which 
has attempted to survey the whole field. This book does so, and brings 
together a very large body of material. One hoped that it would 
satisfy a long felt want—but though possibly intended to do so—it 
fails utterly in this respect. The crying need is for a rational coordi- 
nating exposition. 

That is to say, the whole movement is in a state of unrelated chaos, 
and the book reflects that condition. While it includes isolated dis- 
cussion and explanation of tests, methods, age scales, educational, 
vocational and personality tests, as well as more technical and theo- 
retical arguments regarding test-building, norms, mental growth, 
variability and the nature of intelligence, it accepts the lack of unity 
as a phase through which the movement must slowly pass. It should 
criticize it as a stage that was not inevitable, and that certainly should 
have been passed long since. 

To particularize. The author agrees essentially with Spearman’s 
Two Factor Theory, though he derives his conviction from the con- 
sideration of different data. Having adopted the theory, he should 
view the subject from that angle, and accept and use fully the statis- 
tical procedures involved. He does not—hence his failure. 

For example. He does not recognize that a general intelligence 
test is so built that the composite score is not a measure of the central 
factor plus the specific abilities, but that (at least in a good test) the 
latter cancel out. They can be measured, but not by general intelli- 
gence tests. So he does not relate special tests to general tests, and 
accepts uncritically such absurd ratios as AQ’s, which prove on analysis 
to be devoid of meaning. Again in discussing the constancy of IQ, we 
find (p. 345) “the correlation after an interval of time is not much less 
than when the retest is given immediately.” It is the study of that 
“not much less” which reveals just why the IQ (Binet) is not exactly con- 
stant, and never can be. Exact statistical attacks on these and similar 
problems such as variability, and the limit of mental growth would 
show the utter futility of the major portion of all research in these 
subjects. But we read (p. 240) “we must abandon the demand which 
rests on mathematical considerations,” and realize the lack which is 
characteristic of the book, and of the test movement. It is the failure 
to realize that clear definition and exact statistical work are essential 
to a progressive science. 

Thus the book epitomizes the present situation. It is a queer 
assembly of such elementary things as a sample intelligence test and 
technical arguments, which are so involved that the author makes 
some false deductions. Many sections are inadequate, many are 
excellent. It is valuable as a reference text if studied very critically. 
It misses completely a great opportunity. 


C. S. SLOCOMBE. 
Lincoln School of Teachers College, New York. 


AN ATTEMPT AT INTEGRATION OF ACADEMIC AND PRACTICAL 


Practical Psychology, by Edward Stevens Robinson, New York: The 
Macmillan Co., 1926. Pp. XII + 479. 


The author has made an “effort to integrate psychology with the 
issues of the work-a-day world” in the interests of “the great majority 
of students who spend one quarter or one semester in a general course 
in psychology but who do no further work in this field.” He has chosen 
a middle course between a purely academic systematized treatment, 
and a purely functional text which would make the starting point and 
organizing principle of the study, the important needs of life. The 
material in this text has been chosen and arranged primarily under 
the influence of the typical academic course in psychology, but the 
author has introduced at very many points illustrations comprehensi- 
ble and interesting to those whose interest is outside of the field of 
psychology. The principles are stated with such care as to accord 
with the best recent scientific study, and illustrated in terms of the 
common, if not the important, concerns of everyday life. 

The reader who glances through the book to note the chapter 
headings will perhaps be disappointed at the conventional character 
of the topics which meet his eye. Part I offers a preparation for 
psychology including a study of the subject-matter, methods, uses of 
psychology; the nervous system, sense organs, muscles, and glands. 
Part II discusses reflexes, habit formation, habit fixation, habit elimina- 
tion, and the operation of habits. Part III under the heading “Per- 
ception” deals with attention, and the varieties of perception in terms 
of the number of senses involved. Ideas, concepts, memory, imagina- 
tion, and reasoning fall in Part IV. Feeling is the basis for Part V, 
and the individual, his personality and measurement of his abilities 
complete the list. Beneath these conventional headings there is a 
simple and clear statement of the gist of psychological thinking on the 
topic, with many of the illustrations which would occur in the dis- 
course of a popular professor. 

Part II is so effective and concise a discussion of the general char- 
acteristics of learning that students in educational psychology might 
well read it to obtain an easy background for more detailed analysis. 
Valuable suggestions for effective study, for original thinking, and for 
valid reasoning may be found in the chapters headed “Memory,” 
“Imagination,” and “Reasoning.” The time given to the study of 
perception is much more difficult to justify in a course with such a 
purpose. The author apparently found it almost impossible to pro- 
duce in this realm, illustrations which have any import in life. Perhaps 
here, as at a few other points, long service in the bonds of academic 
psychology would not permit the discarding of material which was 
really unnecessary. Why else include in such a practical text a dis- 
cussion which must conclude (p. 378) “Since it is impossible to draw 
a sharp line between feeling and cognition, these exceptions to the 
general rule that exteroceptors give rise to cognitions and that pro- 
prioceptors and interoceptors give rise to feelings need not cause any 
great amount of worry”? 

The book is in good form. Outlining is clear and logical. Each 
chapter begins with certain questions which form the set for the 
student while reading the chapter. At the end of each chapter excel- 
lent summaries are offered, stating in language which freshmen can 
understand, the high points of the discussion. Then a series of prob- 
lems and exercises is suggested, and the chapter concludes with a 
short list of references for further study. 

The book is clearly a useful step in a desirable direction. There 
may be fewer naps in undergraduate courses introducing psychology, 
because this text is available. 


GOODWIN B. WATSON. 
Teachers College, Columbia University. 


A BRITISH TEXT, RICH IN ILLUSTRATION 


Mental Life: An Introduction to Psychology, by Beatrice Edgell. Lon- 
don: Methuen and Co., 1926. Pp. XVI + 275. 7s 6d. 


British psychology, as seen in the systematic treatises of Ward, 
Sully, and Stout, presents a characteristic traditional approach that 
appears once more in Miss Edgell’s book. Her work has been enriched 
by new influences from America and the Continent, but the principles 
of her predecessors still furnish the core. 

“The treatment is genetic, and the outstanding features of impul- 
sive behavior, emotion, and adolescence are illustrated from studies of 
child and animal life and from autobiography.” So the jacket of the 
book declares. The author’s problem is to show how the structural 
organization of mind has developed by new meanings and values which 
are superimposed upon those which are primitive for the human being. 
Chapter II on the analysis of mental life and experience aims to ac- 
quaint the student with the categories of cognition, affection, and 
conation. Miss Edgell’s very readable style has certainly helped 
her attain this object, although it must be confessed that the chapter 
appears out of place at this point. A brief, yet adequate, survey of 
nervous functions follows; the selection of topics and their treatment, 
however, varies between extremes of simplicity and abstruseness. 
The problem of endowment is handled in the section on “Primitive 
Values” in which she distinguishes reflexes, selective bodily adjust- 
ments, impulses (conation), and instincts and appetites. Out of 
this issues a study of the fundamental processes of intellectual growth 
which she claims are retention and attention, selection and integra- 
tion, and consciousness of likeness and consciousness of difference. 
The chapter on learning unfortunately is weak, possibly because the 
author prefers to use the term in a very restricted sense; she makes a 
very spirited protest against using the word “habit” for performances 
other than those which have become automatic through repetition. 
The topic of ideation is treated rather fully and includes descrip- 
tions of eidetic imagery and ecphorized engrams. But it is in her 
discussion of emotional attitudes that Miss Edgell’s literary skill 
appears to advantage. Shand and McDougall furnish the principles, 
and the standard English novels the illustrations, for her main theses. 
The chapter on adolescence is vividly interesting and indicates a 
remarkable understanding and appreciation of juvenile motives. 
The final chapters on the formation of the sentiments, and on con- 
duct and character, are admirable for their lucidity and for the manner 
in which the ethical and philosophical implications have been revealed. 
A brief appendix on the body-mind problem appears to have been an 
after-thought and not an especially happy one. Miss Edgell’s book will 
hardly serve as a text for introductory courses in psychology, but its 
richness of concrete detail and reference makes it attractive and service- 
able to the general reader. 


EDWIN MAURICE BAILOR. 
Dartmouth College. 





WHAT EUROPEAN EXPERIMENTAL SCHOOLS ARE DOING 


New Schools in the Old World, by Carleton Washburne, in collaboration 
with Myron M. Stearns. New York: The John Day Company, 
1926. Pp. 174. 


Mr. Washburne has very commendably attempted to popularize 
some of the ideas back of progressive educational tendencies by giving 
in this book his impressions of a dozen progressive schools scattered 
throughout Europe. He does not intend this to represent a compre- 
hensive, scientific investigation, nor an evaluation of the experiments 
he describes. He does not recommend specific “methods,” nor 
announce, “This is the ideal way,” but, with truly admirable lack of 
prejudice for “Winnetka” ideals, he points out the trends which 
characterize the efforts of those fearless revolutionists in European 
education who have had “the courage of their convictions, and have 
dared to overturn all traditions in their efforts to give children a touch 
of freedom, a glimpse of beauty, a chance to create, and a breath of 
real independence.” 

Although he is most favorably impressed by those experiments 
which seem to combine “freedom and spontaneity with routine and 
the achievement of definite goals,” this man, who is known for his 
“statistical analyses,” can also see the good in the results of “the 
method of humbly and lovingly muddling along with children.” 

What can the New Schools of the Old World teach us? Every 
page of this stimulating and readable little book turns a relentless 
light on the “weary formality” of the typical American School—on 
its “heartbreaking failure to combine life and school satisfactorily.” 


ANNA SHUMAKER. 